[ https://issues.apache.org/jira/browse/PARQUET-406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reuben Kuhnert reassigned PARQUET-406:
--------------------------------------

    Assignee: Reuben Kuhnert

> Counter Initialization causes NPE
> ---------------------------------
>
>                 Key: PARQUET-406
>                 URL: https://issues.apache.org/jira/browse/PARQUET-406
>             Project: Parquet
>          Issue Type: Bug
>            Reporter: Reuben Kuhnert
>            Assignee: Reuben Kuhnert
>
> {code}
> CREATE EXTERNAL TABLE api_hit_parquet_test
> ROW FORMAT SERDE 'com.foursquare.hadoop.hive.serde.RecordV2SerDe'
> WITH SERDEPROPERTIES ('serialization.class' = 'com.foursquare.logs.gen.ApiHit')
> STORED AS
>   INPUTFORMAT 'com.foursquare.hadoop.hive.io.HiveThriftParquetInputFormat'
>   OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION '/user/bly/api_hit_parquet'
> TBLPROPERTIES ('thrift.parquetfile.input.format.thrift.class' = 'com.foursquare.logs.gen.ApiHit')
> {code}
> The table is successfully created, and I can verify the schema is correct by 
> running DESCRIBE FORMATTED on it. However, when I try to do a simple SELECT * 
> on the table, I get the following stack trace:
> {code}
> java.io.IOException: java.lang.RuntimeException: Could not read first record (and it was not an EOF)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
>         at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
>         at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1657)
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
>         at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
> Caused by: java.lang.RuntimeException: Could not read first record (and it was not an EOF)
>         at com.twitter.elephantbird.mapred.input.DeprecatedInputFormatWrapper$RecordReaderWrapper.initKeyValueObjects(DeprecatedInputFormatWrapper.java:280)
>         at com.twitter.elephantbird.mapred.input.DeprecatedInputFormatWrapper$RecordReaderWrapper.createValue(DeprecatedInputFormatWrapper.java:297)
>         at com.foursquare.hadoop.hive.io.HiveThriftParquetInputFormat$$anon$1.<init>(HiveThriftParquetInputFormat.scala:47)
>         at com.foursquare.hadoop.hive.io.HiveThriftParquetInputFormat.getRecordReader(HiveThriftParquetInputFormat.scala:46)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:667)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:323)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:445)
>         ... 9 more
> Caused by: org.apache.parquet.io.ParquetDecodingException: Can not read value at 0 in block -1 in file hdfs://hadoop-alidoro-nn-vip/user/bly/api_hit_parquet/part-m-00000.parquet
>         at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:243)
>         at org.apache.parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:227)
>         at com.twitter.elephantbird.mapred.input.DeprecatedInputFormatWrapper$RecordReaderWrapper.initKeyValueObjects(DeprecatedInputFormatWrapper.java:271)
>         ... 15 more
> Caused by: java.lang.NullPointerException
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at org.apache.parquet.hadoop.util.ContextUtil.invoke(ContextUtil.java:264)
>         at org.apache.parquet.hadoop.util.ContextUtil.incrementCounter(ContextUtil.java:273)
>         at org.apache.parquet.hadoop.util.counters.mapreduce.MapReduceCounterAdapter.increment(MapReduceCounterAdapter.java:38)
>         at org.apache.parquet.hadoop.util.counters.BenchmarkCounter.incrementTotalBytes(BenchmarkCounter.java:78)
>         at org.apache.parquet.hadoop.ParquetFileReader.readNextRowGroup(ParquetFileReader.java:497)
>         at org.apache.parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:130)
>         at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:214)
>         ... 17 more
> {code}
> I have spent some time following this stack trace, and the failure appears to be in the Counter code, which is odd because I don't use counters at all. Is there some way I need to initialize counters?
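> For reference, this is how I understand a counter is normally driven from the task context (a sketch only; the group and counter names here are made up, not necessarily the ones Parquet uses):
> {code}
> // Normal MapReduce counter usage (sketch): the framework-provided task
> // context hands back a live org.apache.hadoop.mapreduce.Counter, which
> // is then incremented. "parquet" / "totalbytes" are illustrative names.
> Counter bytesRead = context.getCounter("parquet", "totalbytes");
> bytesRead.increment(1024L);
> {code}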
> To be specific, I have found that MapReduceCounterAdapter is being created 
> with a null parameter. Here is the constructor:
> {code}
> public MapReduceCounterAdapter(Counter adaptee) {
>     this.adaptee = adaptee;
> }
> {code}
> So adaptee is passed in as null and then dereferenced later on, which causes the NullPointerException.
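> A guard at the point of use would at least avoid the crash (a sketch only, assuming the adapter's increment method looks roughly like this; it is not the actual Parquet code):
> {code}
> // Defensive variant (sketch): treat a missing underlying Counter as a
> // no-op instead of dereferencing null in increment().
> public void increment(long val) {
>     if (adaptee != null) {
>         adaptee.increment(val);
>     }
> }
> {code}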
> The adaptee parameter is created by this method:
> {code}
> public static Counter getCounter(TaskInputOutputContext context,
>                                  String groupName, String counterName) {
>     return (Counter) invoke(GET_COUNTER_METHOD, context, groupName, counterName);
> }
> {code}
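> Since the reflective call can apparently return null (for example if the context is null, or if GET_COUNTER_METHOD was not resolved for the running Hadoop version), it seems better to fail fast here than to hand the null to MapReduceCounterAdapter. A sketch of what I mean:
> {code}
> // Fail-fast variant (sketch): surface a missing counter immediately
> // instead of letting the null flow into MapReduceCounterAdapter.
> public static Counter getCounter(TaskInputOutputContext context,
>                                  String groupName, String counterName) {
>     Counter counter = (Counter) invoke(GET_COUNTER_METHOD, context, groupName, counterName);
>     if (counter == null) {
>         throw new IllegalStateException(
>             "Could not resolve counter " + groupName + ":" + counterName);
>     }
>     return counter;
> }
> {code}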



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
