[jira] [Commented] (HIVE-4975) Reading orc file throws exception after adding new column

2014-03-02 Thread cyril liao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917691#comment-13917691
 ] 

cyril liao commented on HIVE-4975:
--

Just returning null is not enough. Think about renaming a column, or changing 
the order of the columns.
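
Returning null only covers columns appended at the end. Resolving columns by name
against the names recorded in the file handles reordering as well, though a renamed
column would still surface as null. A minimal sketch of the idea (function and
variable names are hypothetical, not Hive's actual reader code):

```python
def resolve_columns(file_schema, read_schema):
    """Map each column the reader asks for to its position in the file,
    matching by name instead of by index. None means the file has no
    column with that name (added, or renamed, since the file was written)."""
    positions = {name: i for i, name in enumerate(file_schema)}
    return [positions.get(name) for name in read_schema]

def read_row(row, mapping):
    # Columns missing from the file come back as None (SQL NULL).
    return [row[i] if i is not None else None for i in mapping]
```

With a file written as (a, b, c) and a table now declared (a, b, c, d), the
mapping is [0, 1, 2, None], and reordered columns still land in the right place;
a renamed column, however, maps to None and silently loses data, which is
exactly the gap this comment points at.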

 Reading orc file throws exception after adding new column
 -

 Key: HIVE-4975
 URL: https://issues.apache.org/jira/browse/HIVE-4975
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.11.0
 Environment: hive 0.11.0 hadoop 1.0.0
Reporter: cyril liao
Assignee: Kevin Wilfong
Priority: Critical
  Labels: orcfile
 Fix For: 0.13.0

 Attachments: HIVE-4975.1.patch.txt


 ORC file read failure after adding a table column.
 Create a table with three columns (a string, b string, c string).
 Add a new column after c by executing ALTER TABLE table ADD COLUMNS (d 
 string).
 Execute the HiveQL select d from table; the following exception occurs:
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
 Hive Runtime Error while processing row [Error getting row data with 
 exception java.lang.ArrayIndexOutOfBoundsException: 4
   at 
 org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
   at 
 org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
   at 
 org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
   at 
 org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
  ]
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row [Error getting row data with exception 
 java.lang.ArrayIndexOutOfBoundsException: 4
   at 
 org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
   at 
 org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
   at 
 org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
   at 
 org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
  ]
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:671)
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
   ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 d
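
The ArrayIndexOutOfBoundsException above comes from indexing the file's struct
fields with the (longer) table schema's field number. A bounds-checked accessor
that degrades to null is the obvious defensive shape; sketched below with
made-up names, as an illustration rather than the actual HIVE-4975 patch:

```python
def get_struct_field_data(fields, field_index):
    # fields: the values actually present in this ORC file's struct.
    # field_index: the position requested by the (possibly newer) table schema.
    if field_index < 0 or field_index >= len(fields):
        return None  # column added after this file was written
    return fields[field_index]
```

For a file holding three values, any higher index returns None instead of
throwing, so `select d` would yield NULL for old files.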
 

[jira] [Commented] (HIVE-4996) unbalanced calls to openTransaction/commitTransaction

2014-01-06 Thread cyril liao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862873#comment-13862873
 ] 

cyril liao commented on HIVE-4996:
--

Everything works well after I changed the connection pool from BoneCP to DBCP. 
You can give it a try.
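
For anyone wanting to try the same workaround: the metastore's connection pool
is selected through DataNucleus configuration, and the property below is the
usual knob. Verify the exact property name and accepted values against your
Hive version's defaults before relying on it.

```xml
<!-- hive-site.xml: switch the metastore connection pool from BoneCP to DBCP -->
<property>
  <name>datanucleus.connectionPoolingType</name>
  <value>DBCP</value>
</property>
```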

 unbalanced calls to openTransaction/commitTransaction
 -

 Key: HIVE-4996
 URL: https://issues.apache.org/jira/browse/HIVE-4996
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.10.0, 0.11.0
 Environment: hiveserver1  Java HotSpot(TM) 64-Bit Server VM (build 
 20.6-b01, mixed mode)
Reporter: wangfeng
Priority: Critical
  Labels: hive, metastore
   Original Estimate: 504h
  Remaining Estimate: 504h

 When we used hiveserver1 based on hive-0.10.0, we found the following exception 
 thrown:
 FAILED: Error in metadata: MetaException(message:java.lang.RuntimeException: 
 commitTransaction was called but openTransactionCalls = 0. This probably 
 indicates that there are unbalanced calls to openTransaction/commitTransaction)
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask
 help



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-4996) unbalanced calls to openTransaction/commitTransaction

2014-01-06 Thread cyril liao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cyril liao updated HIVE-4996:
-

Attachment: hive-4996.path

Change connection pool from BoneCP to DBCP.

 unbalanced calls to openTransaction/commitTransaction
 -

 Key: HIVE-4996
 URL: https://issues.apache.org/jira/browse/HIVE-4996
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.10.0, 0.11.0, 0.12.0
 Environment: hiveserver1  Java HotSpot(TM) 64-Bit Server VM (build 
 20.6-b01, mixed mode)
Reporter: wangfeng
Priority: Critical
  Labels: hive, metastore
 Attachments: hive-4996.path

   Original Estimate: 504h
  Remaining Estimate: 504h

 When we used hiveserver1 based on hive-0.10.0, we found the following exception 
 thrown:
 FAILED: Error in metadata: MetaException(message:java.lang.RuntimeException: 
 commitTransaction was called but openTransactionCalls = 0. This probably 
 indicates that there are unbalanced calls to openTransaction/commitTransaction)
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask
 help





[jira] [Updated] (HIVE-4996) unbalanced calls to openTransaction/commitTransaction

2014-01-06 Thread cyril liao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cyril liao updated HIVE-4996:
-

 Tags: hive Metastore  (was: hive hiveserver)
Affects Version/s: 0.12.0
   Status: Patch Available  (was: Open)

 unbalanced calls to openTransaction/commitTransaction
 -

 Key: HIVE-4996
 URL: https://issues.apache.org/jira/browse/HIVE-4996
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.12.0, 0.11.0, 0.10.0
 Environment: hiveserver1  Java HotSpot(TM) 64-Bit Server VM (build 
 20.6-b01, mixed mode)
Reporter: wangfeng
Priority: Critical
  Labels: hive, metastore
 Attachments: hive-4996.path

   Original Estimate: 504h
  Remaining Estimate: 504h

 When we used hiveserver1 based on hive-0.10.0, we found the following exception 
 thrown:
 FAILED: Error in metadata: MetaException(message:java.lang.RuntimeException: 
 commitTransaction was called but openTransactionCalls = 0. This probably 
 indicates that there are unbalanced calls to openTransaction/commitTransaction)
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask
 help





[jira] [Commented] (HIVE-4996) unbalanced calls to openTransaction/commitTransaction

2014-01-06 Thread cyril liao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863702#comment-13863702
 ] 

cyril liao commented on HIVE-4996:
--

BoneCP causes communication with the database to fail; this is why there are 
unbalanced transactions.

 unbalanced calls to openTransaction/commitTransaction
 -

 Key: HIVE-4996
 URL: https://issues.apache.org/jira/browse/HIVE-4996
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.10.0, 0.11.0, 0.12.0
 Environment: hiveserver1  Java HotSpot(TM) 64-Bit Server VM (build 
 20.6-b01, mixed mode)
Reporter: wangfeng
Priority: Critical
  Labels: hive, metastore
 Attachments: hive-4996.path

   Original Estimate: 504h
  Remaining Estimate: 504h

 When we used hiveserver1 based on hive-0.10.0, we found the following exception 
 thrown:
 FAILED: Error in metadata: MetaException(message:java.lang.RuntimeException: 
 commitTransaction was called but openTransactionCalls = 0. This probably 
 indicates that there are unbalanced calls to openTransaction/commitTransaction)
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask
 help





[jira] [Commented] (HIVE-4996) unbalanced calls to openTransaction/commitTransaction

2014-01-06 Thread cyril liao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863708#comment-13863708
 ] 

cyril liao commented on HIVE-4996:
--

Maybe Hive uses BoneCP incorrectly somewhere, but I am sure BoneCP is 
responsible for this issue.

 unbalanced calls to openTransaction/commitTransaction
 -

 Key: HIVE-4996
 URL: https://issues.apache.org/jira/browse/HIVE-4996
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.10.0, 0.11.0, 0.12.0
 Environment: hiveserver1  Java HotSpot(TM) 64-Bit Server VM (build 
 20.6-b01, mixed mode)
Reporter: wangfeng
Priority: Critical
  Labels: hive, metastore
 Attachments: hive-4996.path

   Original Estimate: 504h
  Remaining Estimate: 504h

 When we used hiveserver1 based on hive-0.10.0, we found the following exception 
 thrown:
 FAILED: Error in metadata: MetaException(message:java.lang.RuntimeException: 
 commitTransaction was called but openTransactionCalls = 0. This probably 
 indicates that there are unbalanced calls to openTransaction/commitTransaction)
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask
 help





[jira] [Commented] (HIVE-5235) Infinite loop with ORC file and Hive 0.11

2013-12-05 Thread cyril liao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840921#comment-13840921
 ] 

cyril liao commented on HIVE-5235:
--

In my environment, the problem appeared when an int-type column contained null 
values. It was solved after I changed the null values to -1 as a default.
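
The workaround described above can be applied when (re)writing the data; a
hedged HiveQL sketch, with made-up table and column names:

```sql
-- Rewrite the data so the int column carries -1 instead of NULL.
INSERT OVERWRITE TABLE t_orc
SELECT uid, COALESCE(cnt, -1) AS cnt
FROM t_src;
```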

 Infinite loop with ORC file and Hive 0.11
 -

 Key: HIVE-5235
 URL: https://issues.apache.org/jira/browse/HIVE-5235
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
 Environment: Gentoo linux with Hortonworks Hadoop 
 hadoop-1.1.2.23.tar.gz and Apache Hive 0.11d
Reporter: Iván de Prado
Priority: Blocker
 Attachments: gendata.py


 We are using Hive 0.11 with the ORC file format, and some tasks get blocked in 
 some kind of infinite loop. They keep working indefinitely when we set a huge 
 task expiry timeout. If we set the expiry time to 600 seconds, the tasks fail 
 for not reporting progress, and finally the job fails.
 This is not consistent, and the behavior sometimes changes between job 
 executions. It happens for different queries.
 We are using Hive 0.11 with Hadoop hadoop-1.1.2.23 from Hortonworks. The 
 blocked task keeps consuming 100% CPU, and the stack trace is consistently the 
 same. Everything points to some kind of infinite loop. My guess is that it is 
 related to the ORC file: maybe a pointer is written incorrectly, producing an 
 infinite loop when reading, or maybe there is a bug in the reading stage.
 More information below. The stack trace:
 {noformat} 
 main prio=10 tid=0x7f20a000a800 nid=0x1ed2 runnable [0x7f20a8136000]
java.lang.Thread.State: RUNNABLE
   at java.util.zip.Inflater.inflateBytes(Native Method)
   at java.util.zip.Inflater.inflate(Inflater.java:256)
   - locked 0xf42a6ca0 (a java.util.zip.ZStreamRef)
   at 
 org.apache.hadoop.hive.ql.io.orc.ZlibCodec.decompress(ZlibCodec.java:64)
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.readHeader(InStream.java:128)
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:143)
   at 
 org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readVulong(SerializationUtils.java:54)
   at 
 org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readVslong(SerializationUtils.java:65)
   at 
 org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReader.readValues(RunLengthIntegerReader.java:66)
   at 
 org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReader.next(RunLengthIntegerReader.java:81)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:332)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:802)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1214)
   at 
 org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:71)
   at 
 org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:46)
   at 
 org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
   at 
 org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108)
   at 
 org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:300)
   at 
 org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:218)
   at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
   - eliminated 0xe1459700 (a 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader)
   at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
   - locked 0xe1459700 (a 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1178)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 {noformat} 
 We have seen the same stack trace 

[jira] [Created] (HIVE-5888) group by after join operation product no result when hive.optimize.skewjoin = true

2013-11-25 Thread cyril liao (JIRA)
cyril liao created HIVE-5888:


 Summary: group by after join operation product no result when  
hive.optimize.skewjoin = true 
 Key: HIVE-5888
 URL: https://issues.apache.org/jira/browse/HIVE-5888
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: cyril liao






--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5888) group by after join operation product no result when hive.optimize.skewjoin = true

2013-11-25 Thread cyril liao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832362#comment-13832362
 ] 

cyril liao commented on HIVE-5888:
--

If hive.optimize.skewjoin is set to false, we get the right result.

 group by after join operation product no result when  hive.optimize.skewjoin 
 = true 
 

 Key: HIVE-5888
 URL: https://issues.apache.org/jira/browse/HIVE-5888
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: cyril liao







[jira] [Commented] (HIVE-5123) group by on a same key producing wrong result

2013-09-17 Thread cyril liao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770402#comment-13770402
 ] 

cyril liao commented on HIVE-5123:
--

SELECT p_day,
       count(*)
FROM
  (SELECT p_day, uid, poid
   FROM t_app_bd_stat
   WHERE p_day >= 20130910
     AND p_day <= 20130917
     AND newuser = 1
   GROUP BY p_day, uid, poid) tmp
GROUP BY p_day
ORDER BY p_day ASC

The result contains many rows with the same p_day value.

 group by on a same key producing wrong result
 -

 Key: HIVE-5123
 URL: https://issues.apache.org/jira/browse/HIVE-5123
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: cyril liao

 Grouping by the same key twice runs a single MapReduce job, producing a wrong 
 result.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5124) group by without map aggregation lead to mapreduce exception

2013-08-21 Thread cyril liao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747102#comment-13747102
 ] 

cyril liao commented on HIVE-5124:
--

The SQL is:
SELECT channeled,
       max(VV) AS VV,
       max(FUV) AS FUV,
       max(PV) AS PV,
       max(UV) AS UV
FROM
  (SELECT channeled,
          sum(CASE WHEN TYPE = 1 THEN a ELSE cast(0 AS bigint) END) AS VV,
          sum(CASE WHEN TYPE = 1 THEN b ELSE cast(0 AS bigint) END) AS FUV,
          sum(CASE WHEN TYPE = 2 THEN a ELSE cast(0 AS bigint) END) AS PV,
          sum(CASE WHEN TYPE = 2 THEN b ELSE cast(0 AS bigint) END) AS UV
   FROM
     (SELECT count(uid) AS a, count(DISTINCT uid) AS b, TYPE, channeled
      FROM
        (SELECT uid,
                channeled,
                TYPE
         FROM
           (SELECT uid,
                   parse_url(url, 'QUERY', 'channeled') AS channeled,
                   1 AS TYPE
            FROM t_html5_vv
            WHERE p_day = ${idate}
            UNION ALL
            SELECT uid,
                   parse_url(url, 'QUERY', 'channeled') AS channeled,
                   2 AS TYPE
            FROM t_html5_pv
            WHERE p_day = ${idate}) tmp
         WHERE channeled IS NOT NULL
           AND channeled <> '') tmp2
      GROUP BY channeled, TYPE) tmp3
   GROUP BY channeled) tmp4
GROUP BY channeled

I want to get UV and FUV from two different tables, t_html5_vv and t_html5_pv, 
and combine the results in one row. The default hive.map.aggr setting in 
hive-site.xml is true, and the SQL works perfectly; but the exception is thrown 
when I set hive.map.aggr=false.
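
The "cannot find field value" error in the trace below is a by-name struct-field
lookup failing: the reduce-side operator asks for a column name that the
upstream row only exposes under internal aliases (_col0, _col1, ...). A toy
model of that lookup, with hypothetical names rather than Hive's real code:

```python
def get_struct_field_ref(field_names, wanted):
    """Return the position of `wanted`, or fail the way
    ObjectInspectorUtils.getStandardStructFieldRef reports it."""
    for i, name in enumerate(field_names):
        if name == wanted:
            return i
    raise RuntimeError("cannot find field %s from %s" % (
        wanted, ["%d:%s" % (i, n) for i, n in enumerate(field_names)]))
```

When the plan wires the wrong names together, every lookup of the original
column name against the aliased row fails, which matches the initialization
error reported here.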

 group by without map aggregation lead to mapreduce exception
 

 Key: HIVE-5124
 URL: https://issues.apache.org/jira/browse/HIVE-5124
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: cyril liao
Assignee: Bing Li

 On my environment, the same query produces different results depending on 
 whether hive.map.aggr is set to true or false.
 With hive.map.aggr=false, the tasktracker reports the following exception:
 java.lang.RuntimeException: Error in configuring object
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
   at 
 org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
   at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
   at 
 org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:485)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
   ... 9 more
 Caused by: java.lang.RuntimeException: Reduce operator initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:160)
   ... 14 more
 Caused by: java.lang.RuntimeException: cannot find field value from [0:_col0, 
 1:_col1, 2:_col2, 3:_col3]
   at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:366)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:143)
   at 
 org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:82)
   at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:299)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:62)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at 

[jira] [Created] (HIVE-5123) group by on a same key producing wrong result

2013-08-20 Thread cyril liao (JIRA)
cyril liao created HIVE-5123:


 Summary: group by on a same key producing wrong result
 Key: HIVE-5123
 URL: https://issues.apache.org/jira/browse/HIVE-5123
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: cyril liao


Grouping by the same key twice runs a single MapReduce job, producing a wrong 
result.



[jira] [Created] (HIVE-5124) group by without map aggregation lead to mapreduce exception

2013-08-20 Thread cyril liao (JIRA)
cyril liao created HIVE-5124:


 Summary: group by without map aggregation lead to mapreduce 
exception
 Key: HIVE-5124
 URL: https://issues.apache.org/jira/browse/HIVE-5124
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: cyril liao


On my environment, the same query produces different results depending on 
whether hive.map.aggr is set to true or false.
With hive.map.aggr=false, the tasktracker reports the following exception:
java.lang.RuntimeException: Error in configuring object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at 
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:485)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
... 9 more
Caused by: java.lang.RuntimeException: Reduce operator initialization failed
at 
org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:160)
... 14 more
Caused by: java.lang.RuntimeException: cannot find field value from [0:_col0, 
1:_col1, 2:_col2, 3:_col3]
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:366)
at 
org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:143)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:82)
at 
org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:299)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
at 
org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:62)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
at 
org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
at 
org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:438)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
at 
org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:153)



[jira] [Created] (HIVE-4975) ORC file read failure after add table column

2013-08-01 Thread cyril liao (JIRA)
cyril liao created HIVE-4975:


 Summary: ORC file read failure after add table column
 Key: HIVE-4975
 URL: https://issues.apache.org/jira/browse/HIVE-4975
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.11.0
 Environment: hive 0.11.0 hadoop 1.0.0
Reporter: cyril liao
Priority: Critical


ORC file read failure after adding a table column.
Create a table with three columns (a string, b string, c string).
Add a new column after c by executing ALTER TABLE table ADD COLUMNS (d string).
Execute the HiveQL select d from table; the following exception occurs:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
Hive Runtime Error while processing row [Error getting row data with exception 
java.lang.ArrayIndexOutOfBoundsException: 4
at 
org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
at 
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
at 
org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
at 
org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
at 
org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
 ]
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row [Error getting row data with exception 
java.lang.ArrayIndexOutOfBoundsException: 4
at 
org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
at 
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
at 
org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
at 
org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
at 
org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
 ]
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:671)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating d
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:80)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
at 

[jira] [Updated] (HIVE-4975) Reading orc file throws exception after adding new column

2013-08-01 Thread cyril liao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cyril liao updated HIVE-4975:
-

   Tags: ORCfile
 Labels: orcfile  (was: )
Summary: Reading orc file throws exception after adding new column  (was: ORC file read failure after add table column)

 Reading orc file throws exception after adding new column
 -

 Key: HIVE-4975
 URL: https://issues.apache.org/jira/browse/HIVE-4975
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.11.0
 Environment: hive 0.11.0 hadoop 1.0.0
Reporter: cyril liao
Priority: Critical
  Labels: orcfile

 ORC file reads fail after adding a table column.
 Create a table with three columns (a string, b string, c string).
 Add a new column after c by executing ALTER TABLE table ADD COLUMNS (d string).
 Execute the HiveQL select d from table; the following exception is thrown:
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row [Error getting row data with exception java.lang.ArrayIndexOutOfBoundsException: 4
   at org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
   at org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
   at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
   at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
   at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
  ]
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row [Error getting row data with exception java.lang.ArrayIndexOutOfBoundsException: 4
   at org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
   at org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
   at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
   at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
   at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
  ]
   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:671)
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
   ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating d
   at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:80)
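The ArrayIndexOutOfBoundsException above happens because the struct inspector indexes into the fields actually stored in the ORC file, and a column added after the file was written has no stored field. A minimal Python sketch of the bounds-check idea (illustrative only; the real fix lives in OrcStruct.java, and as the comment notes, index-based lookup alone still cannot handle renamed or reordered columns):

```python
def get_struct_field(row_fields, field_index):
    """Return the value at field_index, or None when the file was written
    before the column existed (index beyond the stored fields).
    Models the bounds check that avoids ArrayIndexOutOfBoundsException."""
    if field_index < 0 or field_index >= len(row_fields):
        return None  # column added after this file was written -> NULL
    return row_fields[field_index]

# File written with columns (a, b, c); the table now also has d (index 3).
row = ["av", "bv", "cv"]
print(get_struct_field(row, 3))  # -> None (NULL for the new column d)
print(get_struct_field(row, 1))  # -> bv
```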
   

[jira] [Commented] (HIVE-2702) listPartitionsByFilter only supports string partitions

2013-04-11 Thread cyril liao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13629724#comment-13629724
 ] 

cyril liao commented on HIVE-2702:
--

Dropping a partition fails with org.apache.hadoop.hive.ql.parse.SemanticException because of 
this design.
Dropping a partition is a basic operation; why should this restriction exist if it breaks 
such a basic operation?

 listPartitionsByFilter only supports string partitions
 --

 Key: HIVE-2702
 URL: https://issues.apache.org/jira/browse/HIVE-2702
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Aniket Mokashi
Assignee: Aniket Mokashi
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2702.D2043.1.patch, 
 HIVE-2702.1.patch


 listPartitionsByFilter supports only string partitions. This is because 
 it is explicitly specified in generateJDOFilterOverPartitions in 
 ExpressionTree.java:
 // Can only support partitions whose types are string
 if( ! table.getPartitionKeys().get(partitionColumnIndex).
     getType().equals(org.apache.hadoop.hive.serde.Constants.STRING_TYPE_NAME) ) {
   throw new MetaException(
     "Filtering is supported only on partition keys of type string");
 }

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-3856) Authorization reports NPE when table partition does not exist

2013-01-03 Thread cyril liao (JIRA)
cyril liao created HIVE-3856:


 Summary: Authorization reports NPE when table partition does not exist
 Key: HIVE-3856
 URL: https://issues.apache.org/jira/browse/HIVE-3856
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Affects Versions: 0.9.0
 Environment: hadoop 0.20.205 hive 0.9.0
Reporter: cyril liao


The following HQL reports an NPE:
use app;select a.name from( select profile['net'] as name from app.app_profile 
where p_day = 20130103 group by profile['net']) a left outer join 
app.app_network_mode b on a.name = b.name where b.name is null;

the errors are :
2013-01-04 11:10:05,905 ERROR ql.Driver (SessionState.java:printError(400)) - FAILED: Hive Internal Error: java.lang.NullPointerException(null)
java.lang.NullPointerException
at org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:625)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:486)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:917)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:689)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:557)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

If I change the partition condition from p_day = 20130103 to p_day = 20121228, it works. 
The p_day=20121228 partition is known to exist, but the p_day=20130103 partition does not.
The statement should not report an NPE!
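The NPE comes from Driver.doAuthorization iterating over partition metadata that resolves to null when the referenced partition does not exist. A hedged Python sketch of the defensive shape (illustrative only; names are hypothetical and the actual fix belongs in Driver.java):

```python
def partitions_to_authorize(resolved):
    """A lookup for a partition that does not exist may yield None;
    authorization should treat that as 'nothing to check', not crash
    with a NullPointerException."""
    if resolved is None:
        return []
    return [p for p in resolved if p is not None]

print(partitions_to_authorize(None))                # -> [] instead of an NPE
print(partitions_to_authorize(["p_day=20121228"]))  # -> ['p_day=20121228']
```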



[jira] [Commented] (HIVE-3827) LATERAL VIEW doesn't work with union all statement

2012-12-25 Thread cyril liao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539450#comment-13539450
 ] 

cyril liao commented on HIVE-3827:
--

If I create a table named tmp_tbl as
SELECT
1 as from_pid,
1 as to_pid,
cid as from_path,
(CASE WHEN pid=0 THEN cid ELSE pid END) as to_path,
0 as status
FROM
(SELECT union_map(c_map) AS c_map
FROM
(SELECT collect_map(id,parent_id)AS c_map
FROM
wl_channels
GROUP BY id,parent_id
)tmp
)tmp2
LATERAL VIEW recursion_concat(c_map) a AS cid, pid

and then select from tmp_tbl, the result is correct.

At the same time, doing the same work under Hive 0.7.1 also gives the correct result.


 LATERAL VIEW doesn't work with union all statement
 --

 Key: HIVE-3827
 URL: https://issues.apache.org/jira/browse/HIVE-3827
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.9.0
 Environment: hive0.9.0 hadoop 0.20.205
Reporter: cyril liao

 LATERAL VIEW loses data when used with UNION ALL.
 query NO.1:
 SELECT
 1 as from_pid,
 1 as to_pid,
 cid as from_path,
 (CASE WHEN pid=0 THEN cid ELSE pid END) as to_path,
 0 as status
 FROM
 (SELECT union_map(c_map) AS c_map
 FROM
 (SELECT collect_map(id,parent_id)AS c_map
 FROM
 wl_channels
 GROUP BY id,parent_id
 )tmp
 )tmp2
 LATERAL VIEW recursion_concat(c_map) a AS cid, pid
 this query returns about 1 rows, and their status is 0.
 query NO.2:
 select
 a.from_pid as from_pid,
 a.to_pid as to_pid, 
 a.from_path as from_path,
 a.to_path as to_path,
 a.status as status
 from wl_dc_channels a
 where a.status <> 0
 this query returns about 100 rows, and their status is 1 or 2.
 query NO.3:
 select
 from_pid,
 to_pid,
 from_path,
 to_path,
 status
 from
 (
 SELECT
 1 as from_pid,
 1 as to_pid,
 cid as from_path,
 (CASE WHEN pid=0 THEN cid ELSE pid END) as to_path,
 0 as status
 FROM
 (SELECT union_map(c_map) AS c_map
 FROM
 (SELECT collect_map(id,parent_id)AS c_map
 FROM
 wl_channels
 GROUP BY id,parent_id
 )tmp
 )tmp2
 LATERAL VIEW recursion_concat(c_map) a AS cid, pid
 union all
 select
 a.from_pid as from_pid,
 a.to_pid as to_pid, 
 a.from_path as from_path,
 a.to_path as to_path,
 a.status as status
 from wl_dc_channels a
  where a.status <> 0
 ) unin_tbl
 this query has the same result as query NO.2



[jira] [Commented] (HIVE-3104) Predicate pushdown doesn't work with multi-insert statements using LATERAL VIEW

2012-12-20 Thread cyril liao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13536934#comment-13536934
 ] 

cyril liao commented on HIVE-3104:
--

LATERAL VIEW doesn't work with UNION ALL either.
query NO.1:
 SELECT
 1 as from_pid,
 1 as to_pid,
 cid as from_path,
 (CASE WHEN pid=0 THEN cid ELSE pid END) as to_path,
 0 as status
FROM
(SELECT union_map(c_map) AS c_map
 FROM
 (SELECT collect_map(id,parent_id)AS c_map
  FROM
  wl_channels
  GROUP BY id,parent_id
  )tmp
)tmp2
LATERAL VIEW recursion_concat(c_map) a AS cid, pid

this query returns about 1 rows, and their status is 0.

query NO.2:
 select
  a.from_pid as from_pid,
  a.to_pid as to_pid, 
  a.from_path as from_path,
  a.to_path as to_path,
  a.status as status
from wl_dc_channels a
  where a.status <> 0

this query returns about 100 rows, and their status is 1 or 2.

query NO.3:
select
  from_pid,
  to_pid,
  from_path,
  to_path,
  status
 from
(
 SELECT
 1 as from_pid,
 1 as to_pid,
 cid as from_path,
 (CASE WHEN pid=0 THEN cid ELSE pid END) as to_path,
 0 as status
FROM
(SELECT union_map(c_map) AS c_map
 FROM
 (SELECT collect_map(id,parent_id)AS c_map
  FROM
  wl_channels
  GROUP BY id,parent_id
  )tmp
)tmp2
LATERAL VIEW recursion_concat(c_map) a AS cid, pid
union all
 select
  a.from_pid as from_pid,
  a.to_pid as to_pid, 
  a.from_path as from_path,
  a.to_path as to_path,
  a.status as status
from wl_dc_channels a
  where a.status <> 0
) unin_tbl

this query has the same result as query NO.2

 Predicate pushdown doesn't work with multi-insert statements using LATERAL 
 VIEW
 ---

 Key: HIVE-3104
 URL: https://issues.apache.org/jira/browse/HIVE-3104
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.9.0
 Environment: Apache Hive 0.9.0, Apache Hadoop 0.20.205.0
Reporter: Mark Grover

 Predicate pushdown seems to work for single-insert queries using LATERAL 
 VIEW. It also seems to work for multi-insert queries *not* using LATERAL 
 VIEW. However, it doesn't work for multi-insert queries using LATERAL VIEW.
 Here are some examples. In the below examples, I make use of the fact that a 
 query with no partition filtering when run under hive.mapred.mode=strict 
 fails.
 --Table creation and population
 DROP TABLE IF EXISTS test;
 CREATE TABLE test (col1 array<int>, col2 int) PARTITIONED BY (part_col int);
 INSERT OVERWRITE TABLE test PARTITION (part_col=1) SELECT array(1,2), 
 count(*) FROM test;
 INSERT OVERWRITE TABLE test PARTITION (part_col=2) SELECT array(2,4,6), 
 count(*) FROM test;
 -- Query 1
 -- This succeeds (using LATERAL VIEW with single insert)
 set hive.mapred.mode=strict;
 FROM partition_test
 LATERAL VIEW explode(col1) tmp AS exp_col1
 INSERT OVERWRITE DIRECTORY '/test/1'
 SELECT exp_col1
 WHERE (part_col=2);
 -- Query 2
 -- This succeeds (NOT using LATERAL VIEW with multi-insert)
 set hive.mapred.mode=strict;
 FROM partition_test
 INSERT OVERWRITE DIRECTORY '/test/1'
 SELECT col1
 WHERE (part_col=2)
 INSERT OVERWRITE DIRECTORY '/test/2'
 SELECT col1
 WHERE (part_col=2);
 -- Query 3
 -- This fails (using LATERAL VIEW with multi-insert)
 set hive.mapred.mode=strict;
 FROM partition_test
 LATERAL VIEW explode(col1) tmp AS exp_col1
 INSERT OVERWRITE DIRECTORY '/test/1'
 SELECT exp_col1
 WHERE (part_col=2)
 INSERT OVERWRITE DIRECTORY '/test/2'
 SELECT exp_col1
 WHERE (part_col=2);



[jira] [Commented] (HIVE-3104) Predicate pushdown doesn't work with multi-insert statements using LATERAL VIEW

2012-12-20 Thread cyril liao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537714#comment-13537714
 ] 

cyril liao commented on HIVE-3104:
--

ok

 Predicate pushdown doesn't work with multi-insert statements using LATERAL 
 VIEW
 ---

 Key: HIVE-3104
 URL: https://issues.apache.org/jira/browse/HIVE-3104
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.9.0
 Environment: Apache Hive 0.9.0, Apache Hadoop 0.20.205.0
Reporter: Mark Grover

 Predicate pushdown seems to work for single-insert queries using LATERAL 
 VIEW. It also seems to work for multi-insert queries *not* using LATERAL 
 VIEW. However, it doesn't work for multi-insert queries using LATERAL VIEW.
 Here are some examples. In the below examples, I make use of the fact that a 
 query with no partition filtering when run under hive.mapred.mode=strict 
 fails.
 --Table creation and population
 DROP TABLE IF EXISTS test;
 CREATE TABLE test (col1 array<int>, col2 int) PARTITIONED BY (part_col int);
 INSERT OVERWRITE TABLE test PARTITION (part_col=1) SELECT array(1,2), 
 count(*) FROM test;
 INSERT OVERWRITE TABLE test PARTITION (part_col=2) SELECT array(2,4,6), 
 count(*) FROM test;
 -- Query 1
 -- This succeeds (using LATERAL VIEW with single insert)
 set hive.mapred.mode=strict;
 FROM partition_test
 LATERAL VIEW explode(col1) tmp AS exp_col1
 INSERT OVERWRITE DIRECTORY '/test/1'
 SELECT exp_col1
 WHERE (part_col=2);
 -- Query 2
 -- This succeeds (NOT using LATERAL VIEW with multi-insert)
 set hive.mapred.mode=strict;
 FROM partition_test
 INSERT OVERWRITE DIRECTORY '/test/1'
 SELECT col1
 WHERE (part_col=2)
 INSERT OVERWRITE DIRECTORY '/test/2'
 SELECT col1
 WHERE (part_col=2);
 -- Query 3
 -- This fails (using LATERAL VIEW with multi-insert)
 set hive.mapred.mode=strict;
 FROM partition_test
 LATERAL VIEW explode(col1) tmp AS exp_col1
 INSERT OVERWRITE DIRECTORY '/test/1'
 SELECT exp_col1
 WHERE (part_col=2)
 INSERT OVERWRITE DIRECTORY '/test/2'
 SELECT exp_col1
 WHERE (part_col=2);



[jira] [Created] (HIVE-3827) LATERAL VIEW doesn't work with union all statement

2012-12-20 Thread cyril liao (JIRA)
cyril liao created HIVE-3827:


 Summary: LATERAL VIEW doesn't work with union all statement
 Key: HIVE-3827
 URL: https://issues.apache.org/jira/browse/HIVE-3827
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.9.0
 Environment: hive0.9.0 hadoop 0.20.205
Reporter: cyril liao


LATERAL VIEW loses data when used with UNION ALL.


query NO.1:
SELECT
1 as from_pid,
1 as to_pid,
cid as from_path,
(CASE WHEN pid=0 THEN cid ELSE pid END) as to_path,
0 as status
FROM
(SELECT union_map(c_map) AS c_map
FROM
(SELECT collect_map(id,parent_id)AS c_map
FROM
wl_channels
GROUP BY id,parent_id
)tmp
)tmp2
LATERAL VIEW recursion_concat(c_map) a AS cid, pid
this query returns about 1 rows, and their status is 0.

query NO.2:
select
a.from_pid as from_pid,
a.to_pid as to_pid, 
a.from_path as from_path,
a.to_path as to_path,
a.status as status
from wl_dc_channels a
where a.status <> 0
this query returns about 100 rows, and their status is 1 or 2.

query NO.3:
select
from_pid,
to_pid,
from_path,
to_path,
status
from
(
SELECT
1 as from_pid,
1 as to_pid,
cid as from_path,
(CASE WHEN pid=0 THEN cid ELSE pid END) as to_path,
0 as status
FROM
(SELECT union_map(c_map) AS c_map
FROM
(SELECT collect_map(id,parent_id)AS c_map
FROM
wl_channels
GROUP BY id,parent_id
)tmp
)tmp2
LATERAL VIEW recursion_concat(c_map) a AS cid, pid
union all
select
a.from_pid as from_pid,
a.to_pid as to_pid, 
a.from_path as from_path,
a.to_path as to_path,
a.status as status
from wl_dc_channels a
where a.status <> 0
) unin_tbl
this query has the same result as query NO.2



[jira] [Commented] (HIVE-1545) Add a bunch of UDFs and UDAFs

2011-09-01 Thread cyril liao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13095738#comment-13095738
 ] 

cyril liao commented on HIVE-1545:
--

Neither core.tar.gz nor ext.tar.gz contains the class 
com.facebook.hive.udf.lib.UDFUtils, which is used by many of the UDFs.
The package com.facebook.hive.udf.lib includes only Counter and SetOps.

 Add a bunch of UDFs and UDAFs
 -

 Key: HIVE-1545
 URL: https://issues.apache.org/jira/browse/HIVE-1545
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Jonathan Chang
Assignee: Jonathan Chang
Priority: Minor
 Attachments: core.tar.gz, ext.tar.gz, udfs.tar.gz, udfs.tar.gz


 Here are some UD(A)Fs that can be incorporated into the Hive distribution:
 UDFArgMax - Find the 0-indexed index of the largest argument. e.g., ARGMAX(4, 
 5, 3) returns 1.
 UDFBucket - Find the bucket in which the first argument belongs. e.g., BUCKET(x, b_1, b_2, b_3, ...) will return the smallest i such that x > b_{i} but <= b_{i+1}. Returns 0 if x is smaller than all the buckets.
 UDFFindInArray - Finds the 1-index of the first element in the array given as 
 the second argument. Returns 0 if not found. Returns NULL if either argument 
 is NULL. E.g., FIND_IN_ARRAY(5, array(1,2,5)) will return 3. FIND_IN_ARRAY(5, 
 array(1,2,3)) will return 0.
 UDFGreatCircleDist - Finds the great circle distance (in km) between two 
 lat/long coordinates (in degrees).
 UDFLDA - Performs LDA inference on a vector given fixed topics.
 UDFNumberRows - Number successive rows starting from 1. Counter resets to 1 
 whenever any of its parameters changes.
 UDFPmax - Finds the maximum of a set of columns. e.g., PMAX(4, 5, 3) returns 
 5.
 UDFRegexpExtractAll - Like REGEXP_EXTRACT except that it returns all matches 
 in an array.
 UDFUnescape - Returns the string unescaped (using C/Java style unescaping).
 UDFWhich - Given a boolean array, return the indices which are TRUE.
 UDFJaccard
 UDAFCollect - Takes all the values associated with a row and converts it into 
 a list. Make sure to have: set hive.map.aggr = false;
 UDAFCollectMap - Like collect except that it takes tuples and generates a map.
 UDAFEntropy - Compute the entropy of a column.
 UDAFPearson (BROKEN!!!) - Computes the pearson correlation between two 
 columns.
 UDAFTop - TOP(KEY, VAL) - returns the KEY associated with the largest value 
 of VAL.
 UDAFTopN (BROKEN!!!) - Like TOP except returns a list of the keys associated 
 with the N (passed as the third parameter) largest values of VAL.
 UDAFHistogram
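As a concrete check of one contract in the list above, FIND_IN_ARRAY's described semantics (1-based index of the first match, 0 when absent, NULL when either argument is NULL) can be pinned down with a few lines of Python (an illustrative model, not the shipped Java UDF):

```python
def find_in_array(needle, arr):
    """1-based index of the first match, 0 if absent, None (NULL) if
    either argument is NULL -- per the UDFFindInArray description."""
    if needle is None or arr is None:
        return None
    for i, value in enumerate(arr, start=1):
        if value == needle:
            return i
    return 0

print(find_in_array(5, [1, 2, 5]))  # -> 3
print(find_in_array(5, [1, 2, 3]))  # -> 0
```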





[jira] [Commented] (HIVE-1545) Add a bunch of UDFs and UDAFs

2011-08-31 Thread cyril liao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13095106#comment-13095106
 ] 

cyril liao commented on HIVE-1545:
--

com.facebook.hive.udf.lib.UDFUtils is not included.

Would you please upload it?

 Add a bunch of UDFs and UDAFs
 -

 Key: HIVE-1545
 URL: https://issues.apache.org/jira/browse/HIVE-1545
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Jonathan Chang
Assignee: Jonathan Chang
Priority: Minor
 Attachments: core.tar.gz, ext.tar.gz, udfs.tar.gz, udfs.tar.gz


 Here are some UD(A)Fs that can be incorporated into the Hive distribution:
 UDFArgMax - Find the 0-indexed index of the largest argument. e.g., ARGMAX(4, 
 5, 3) returns 1.
 UDFBucket - Find the bucket in which the first argument belongs. e.g., BUCKET(x, b_1, b_2, b_3, ...) will return the smallest i such that x > b_{i} but <= b_{i+1}. Returns 0 if x is smaller than all the buckets.
 UDFFindInArray - Finds the 1-index of the first element in the array given as 
 the second argument. Returns 0 if not found. Returns NULL if either argument 
 is NULL. E.g., FIND_IN_ARRAY(5, array(1,2,5)) will return 3. FIND_IN_ARRAY(5, 
 array(1,2,3)) will return 0.
 UDFGreatCircleDist - Finds the great circle distance (in km) between two 
 lat/long coordinates (in degrees).
 UDFLDA - Performs LDA inference on a vector given fixed topics.
 UDFNumberRows - Number successive rows starting from 1. Counter resets to 1 
 whenever any of its parameters changes.
 UDFPmax - Finds the maximum of a set of columns. e.g., PMAX(4, 5, 3) returns 
 5.
 UDFRegexpExtractAll - Like REGEXP_EXTRACT except that it returns all matches 
 in an array.
 UDFUnescape - Returns the string unescaped (using C/Java style unescaping).
 UDFWhich - Given a boolean array, return the indices which are TRUE.
 UDFJaccard
 UDAFCollect - Takes all the values associated with a row and converts it into 
 a list. Make sure to have: set hive.map.aggr = false;
 UDAFCollectMap - Like collect except that it takes tuples and generates a map.
 UDAFEntropy - Compute the entropy of a column.
 UDAFPearson (BROKEN!!!) - Computes the pearson correlation between two 
 columns.
 UDAFTop - TOP(KEY, VAL) - returns the KEY associated with the largest value 
 of VAL.
 UDAFTopN (BROKEN!!!) - Like TOP except returns a list of the keys associated 
 with the N (passed as the third parameter) largest values of VAL.
 UDAFHistogram
