Hi Josh,

Can you enable debug logging in conf/hive-log4j.xml? Change the WARN,DRFA setting to DEBUG,console.
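A minimal sketch of that edit, assuming a stock log4j 1.x XML layout with an existing console appender (the exact appender names in your hive-log4j.xml may differ):

```xml
<!-- hive-log4j.xml: root logger switched from WARN,DRFA to DEBUG,console -->
<root>
  <priority value="DEBUG"/>       <!-- was WARN -->
  <appender-ref ref="console"/>   <!-- was DRFA (the daily rolling file appender) -->
</root>
```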
And then rerun the query. It should then give us the full stack trace, and we can start debugging from there. Sorry for the problem; we will try to figure it out once we get enough information.

Zheng

On Thu, Dec 11, 2008 at 7:57 PM, Josh Ferguson <[email protected]> wrote:
> Is there any word on what we should do with this? It's sort of blocking me
> from doing things. Anyone have any leads?
>
> Josh
>
> On Dec 10, 2008, at 8:32 PM, Josh Ferguson wrote:
>
>> Forgot about this one
>>
>> hive> describe extended aggregations;
>> OK
>> value	int
>> account	string
>> application	string
>> dataset	string
>> hour	int
>> aggregation	string
>> aggregated_by	string
>> Detailed Table Information:
>> Table(tableName:aggregations,dbName:default,owner:Josh,createTime:1228463486,lastAccessTime:0,retention:0,sd:StorageDescriptor(cols:[FieldSchema(name:value,type:int,comment:null)],location:/user/hive/warehouse/aggregations,inputFormat:org.apache.hadoop.mapred.TextInputFormat,outputFormat:org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat,compressed:false,numBuckets:-1,serdeInfo:SerDeInfo(name:null,serializationLib:org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe,parameters:{colelction.delim=44,mapkey.delim=58,serialization.format=org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol}),bucketCols:[],sortCols:[],parameters:{}),partitionKeys:[FieldSchema(name:account,type:string,comment:null),FieldSchema(name:application,type:string,comment:null),FieldSchema(name:dataset,type:string,comment:null),FieldSchema(name:hour,type:int,comment:null),FieldSchema(name:aggregation,type:string,comment:null),FieldSchema(name:aggregated_by,type:string,comment:null)],parameters:{})
>> Time taken: 2.884 seconds
>>
>> On Dec 10, 2008, at 8:31 PM, Josh Ferguson wrote:
>>
>>> hive> SELECT COUNT(actor_id) AS value FROM activities WHERE (
>>> account='80c27664-b047-4c0a-86f3-342c0cdf36c7' AND application='myproduct'
>>> AND dataset='purchase' AND hour=341165 );
>>>
>>> This actually works fine; it's writing to the new table that is broken.
>>>
>>> hive> describe extended activities;
>>> OK
>>> actor_id	string
>>> actee_id	string
>>> properties	map<string,string>
>>> account	string
>>> application	string
>>> dataset	string
>>> hour	int
>>> Detailed Table Information:
>>> Table(tableName:activities,dbName:default,owner:Josh,createTime:1228208598,lastAccessTime:0,retention:0,sd:StorageDescriptor(cols:[FieldSchema(name:actor_id,type:string,comment:null),FieldSchema(name:actee_id,type:string,comment:null),FieldSchema(name:properties,type:map<string,string>,comment:null)],location:/user/hive/warehouse/activities,inputFormat:org.apache.hadoop.mapred.TextInputFormat,outputFormat:org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat,compressed:false,numBuckets:32,serdeInfo:SerDeInfo(name:null,serializationLib:org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe,parameters:{colelction.delim=44,mapkey.delim=58,serialization.format=org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol}),bucketCols:[actor_id,actee_id],sortCols:[],parameters:{}),partitionKeys:[FieldSchema(name:account,type:string,comment:null),FieldSchema(name:application,type:string,comment:null),FieldSchema(name:dataset,type:string,comment:null),FieldSchema(name:hour,type:int,comment:null)],parameters:{})
>>> Time taken: 2.656 seconds
>>>
>>> I'm not sure what "describe extended partition" is; it doesn't work for me.
>>>
>>> Josh
>>>
>>> On Dec 10, 2008, at 2:31 PM, Raghu Murthy wrote:
>>>
>>>> Can you check the output of the query without the insert clause?
>>>>
>>>> SELECT COUNT(actor_id) AS value FROM activities WHERE (
>>>> account='80c27664-b047-4c0a-86f3-342c0cdf36c7' AND application='myproduct'
>>>> AND dataset='purchase' AND hour=341165 )
>>>>
>>>> Is it empty?
>>>>
>>>> On 12/10/08 1:08 AM, "Josh Ferguson" <[email protected]> wrote:
>>>>
>>>>> I should say this is during the last reduction step that occurs in this
>>>>> query.
>>>>>
>>>>> Josh
>>>>>
>>>>> On Dec 10, 2008, at 1:06 AM, Josh Ferguson wrote:
>>>>>
>>>>>> This query
>>>>>>
>>>>>> INSERT OVERWRITE TABLE aggregations PARTITION (
>>>>>> account='80c27664-b047-4c0a-86f3-342c0cdf36c7', application='myproduct',
>>>>>> dataset='purchase', hour=341165, aggregation='count', aggregated_by='all' )
>>>>>> SELECT COUNT(actor_id) AS value FROM activities WHERE (
>>>>>> account='80c27664-b047-4c0a-86f3-342c0cdf36c7' AND application='myproduct'
>>>>>> AND dataset='purchase' AND hour=341165 )
>>>>>>
>>>>>> generates this message:
>>>>>>
>>>>>> java.lang.RuntimeException: Error while closing operators:
>>>>>> org.apache.hadoop.hive.ql.metadata.HiveException:
>>>>>> org.apache.hadoop.hive.serde2.SerDeException:
>>>>>> org.apache.hadoop.hive.serde2.SerDeException: Trying to serialize 0 fields
>>>>>> into a struct with 1
>>>>>> at org.apache.hadoop.hive.ql.exec.ExecReducer.close(ExecReducer.java:202)
>>>>>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:440)
>>>>>> at org.apache.hadoop.mapred.Child.main(Child.java:155)
>>>>>>
>>>>>> Josh Ferguson

--
Yours,
Zheng
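For the "describe extended partition" step Josh asks about, the usual form is the table name followed by a PARTITION spec. A sketch, reusing the partition values from the failing INSERT (it will only return output once that partition actually exists):

```sql
-- Partition values copied from the INSERT OVERWRITE in the thread
DESCRIBE EXTENDED aggregations
  PARTITION (account='80c27664-b047-4c0a-86f3-342c0cdf36c7',
             application='myproduct', dataset='purchase', hour=341165,
             aggregation='count', aggregated_by='all');
```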
