[jira] [Commented] (HIVE-8744) hbase_stats3.q test fails when paths stored at JDBCStatsUtils.getIdColumnName() are too large

2014-11-05 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14199451#comment-14199451
 ] 

Sergio Peña commented on HIVE-8744:
---

Thanks [~szehon]. 
I submitted the correct patch.

 hbase_stats3.q test fails when paths stored at 
 JDBCStatsUtils.getIdColumnName() are too large
 -

 Key: HIVE-8744
 URL: https://issues.apache.org/jira/browse/HIVE-8744
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.15.0
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-8744.1.patch


 This test is related to HIVE-8065, where I am working on HDFS encryption 
 support. One of the enhancements for it is to create a .hive-staging 
 directory inside the table directory location where the query is executed.
 Now, when the hbase_stats3.q test runs from a temporary directory with a 
 long path, the new path (a combination of the table location, .hive-staging, 
 and random temporary subdirectories) is too long to fit into the statistics 
 table, so the path is truncated.
 This causes the following error:
 {noformat}
 2014-11-04 08:57:36,680 ERROR [LocalJobRunner Map Task Executor #0]: 
 jdbc.JDBCStatsPublisher (JDBCStatsPublisher.java:publishStat(199)) - Error 
 during publishing statistics. 
 java.sql.SQLDataException: A truncation error was encountered trying to 
 shrink VARCHAR 
 'pfile:/home/hiveptest/hive-ptest-cloudera-slaves-ee9-24.vpc.' to length 255.
   at 
 org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown 
 Source)
   at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown 
 Source)
   at 
 org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown 
 Source)
   at 
 org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown 
 Source)
   at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown 
 Source)
   at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown 
 Source)
   at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown 
 Source)
   at 
 org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown 
 Source)
   at 
 org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeLargeUpdate(Unknown 
 Source)
   at 
 org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(Unknown 
 Source)
   at 
 org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher$2.run(JDBCStatsPublisher.java:148)
   at 
 org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher$2.run(JDBCStatsPublisher.java:145)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.executeWithRetry(Utilities.java:2667)
   at 
 org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher.publishStat(JDBCStatsPublisher.java:161)
   at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.publishStats(FileSinkOperator.java:1031)
   at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:870)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:579)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:591)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:591)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:591)
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:227)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
 Caused by: java.sql.SQLException: A truncation error was encountered trying 
 to shrink VARCHAR 
 'pfile:/home/hiveptest/hive-ptest-cloudera-slaves-ee9-24.vpc.' to length 255.
   at 
 org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
   at 
 org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown
  Source)
   ... 30 more
 Caused by: ERROR 22001: A truncation error was encountered trying to shrink 
 VARCHAR 'pfile:/home/hiveptest/hive-ptest-cloudera-slaves-ee9-24.vpc.' to 
 length 255.
   at org.apache.derby.iapi.error.StandardException.newException(Unknown 
  Source)
 {noformat}
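To make the failure mode concrete: the stats row ID is the full staging path, and the backing Derby column is a VARCHAR(255), so a long enough path overflows it. A minimal sketch of the length check, assuming the 255-character limit quoted in the error message (the helper name is hypothetical, not part of Hive):

```python
# Illustration only: the 255-character limit is taken from the Derby error
# above; fits_stats_column is a hypothetical helper, not Hive code.
MAX_ID_LENGTH = 255

def fits_stats_column(row_id: str, limit: int = MAX_ID_LENGTH) -> bool:
    """Return True if the row ID fits the VARCHAR column without truncation."""
    return len(row_id) <= limit

# A deeply nested .hive-staging path easily exceeds the limit.
staging = "pfile:/home/hiveptest/" + "x" * 300 + "/.hive-staging/-ext-10000"
print(fits_stats_column(staging))  # → False
```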

[jira] [Updated] (HIVE-8744) hbase_stats3.q test fails when paths stored at JDBCStatsUtils.getIdColumnName() are too large

2014-11-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8744:
--
Status: Open  (was: Patch Available)

 hbase_stats3.q test fails when paths stored at 
 JDBCStatsUtils.getIdColumnName() are too large
 -

 Key: HIVE-8744
 URL: https://issues.apache.org/jira/browse/HIVE-8744
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.15.0
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-8744.1.patch



[jira] [Updated] (HIVE-8744) hbase_stats3.q test fails when paths stored at JDBCStatsUtils.getIdColumnName() are too large

2014-11-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8744:
--
Attachment: HIVE-8744.2.patch

 hbase_stats3.q test fails when paths stored at 
 JDBCStatsUtils.getIdColumnName() are too large
 -

 Key: HIVE-8744
 URL: https://issues.apache.org/jira/browse/HIVE-8744
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.15.0
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-8744.1.patch, HIVE-8744.2.patch



[jira] [Updated] (HIVE-8744) hbase_stats3.q test fails when paths stored at JDBCStatsUtils.getIdColumnName() are too large

2014-11-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8744:
--
Status: Patch Available  (was: Open)

Submitted a new patch that changes the stats table name to v3.

 hbase_stats3.q test fails when paths stored at 
 JDBCStatsUtils.getIdColumnName() are too large
 -

 Key: HIVE-8744
 URL: https://issues.apache.org/jira/browse/HIVE-8744
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.15.0
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-8744.1.patch, HIVE-8744.2.patch



[jira] [Commented] (HIVE-8065) Support HDFS encryption functionality on Hive

2014-11-06 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200462#comment-14200462
 ] 

Sergio Peña commented on HIVE-8065:
---

Hi [~Ferd]

Thanks for trying to help. There is already basic work done for this issue in 
a local branch for Hive 0.13. 
I will apply the patch to trunk and commit the changes to the HIVE-8065 branch.

What we don't have yet are the unit/query tests. Would you like to take that 
task?

 Support HDFS encryption functionality on Hive
 -

 Key: HIVE-8065
 URL: https://issues.apache.org/jira/browse/HIVE-8065
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.13.1
Reporter: Sergio Peña
Assignee: Sergio Peña

 The new HDFS encryption support makes Hive incompatible and unusable when 
 the feature is enabled.
 HDFS encryption is designed so that a user can configure different 
 encryption zones (or directories) for multi-tenant environments. An 
 encryption zone has an exclusive encryption key, such as AES-128 or AES-256. 
 For security compliance, HDFS does not allow moving or renaming files 
 between encryption zones; renames are allowed only inside the same 
 encryption zone, while copies between encryption zones are allowed.
 See HDFS-6134 for more details about the HDFS encryption design.
 Hive currently uses a scratch directory (like /tmp/$user/$random). This 
 scratch directory holds intermediate data (between MR jobs) and the final 
 output of the Hive query, which is later moved to the table directory 
 location.
 If Hive tables are in different encryption zones than the scratch directory, 
 then Hive won't be able to rename those files/directories, and that makes 
 Hive unusable.
 To handle this problem, we can place the scratch directory of the 
 query/statement inside the same encryption zone as the table directory 
 location. This way, the renaming process will succeed. 
 Also, for statements that move files between encryption zones (i.e. LOAD 
 DATA), a copy may be executed instead of a rename. This adds overhead when 
 copying large data files, but it won't break encryption in Hive.
 Another security aspect to consider is joins. If Hive joins tables with 
 different encryption key strengths, the results of the select might break 
 the security compliance of the tables. Say two tables encrypted with 128-bit 
 and 256-bit keys are joined; the temporary results might be stored in the 
 128-bit encryption zone, which conflicts with the compliance requirements of 
 the 256-bit table.
 To fix this, Hive should select the scratch directory that is most 
 secure/encrypted, so the intermediate data is stored temporarily with no 
 compliance issues.
 For instance:
 {noformat}
 SELECT * FROM table-aes128 t1 JOIN table-aes256 t2 WHERE t1.id == t2.id;
 {noformat}
 - This should use a scratch directory (or staging directory) inside the 
 table-aes256 table location.
 {noformat}
 INSERT OVERWRITE TABLE table-unencrypted SELECT * FROM table-aes1;
 {noformat}
 - This should use a scratch directory inside the table-aes1 location.
 {noformat}
 FROM table-unencrypted
 INSERT OVERWRITE TABLE table-aes128 SELECT id, name
 INSERT OVERWRITE TABLE table-aes256 SELECT id, name
 {noformat}
 - This should use a scratch directory on each of the tables locations.
 - The first SELECT will have its scratch directory on table-aes128 directory.
 - The second SELECT will have its scratch directory on table-aes256 directory.
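The zone-selection policy described above can be sketched as follows. This is a minimal illustration under the stated policy (stage in the zone with the strongest key); the function and parameter names (choose_staging_dir, table_zones, key_bits) are hypothetical, not Hive's actual implementation:

```python
# Hypothetical sketch of the staging-directory selection policy: among the
# tables a query touches, pick the location whose encryption zone has the
# strongest key, and place the staging directory inside it.
def choose_staging_dir(table_zones):
    """table_zones: list of (table_location, key_bits); key_bits=0 means unencrypted."""
    if not table_zones:
        raise ValueError("query touches no tables")
    # max() by key strength picks the most secure zone.
    location, key_bits = max(table_zones, key=lambda t: t[1])
    return location + "/.hive-staging"

# Join of a 128-bit and a 256-bit table: stage inside the 256-bit zone.
print(choose_staging_dir([("/data/table-aes128", 128),
                          ("/data/table-aes256", 256)]))
```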



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8744) hbase_stats3.q test fails when paths stored at JDBCStatsUtils.getIdColumnName() are too large

2014-11-06 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200933#comment-14200933
 ] 

Sergio Peña commented on HIVE-8744:
---

That patch works well [~prasanth_j].

We can use the one from HIVE-8735 instead.

 hbase_stats3.q test fails when paths stored at 
 JDBCStatsUtils.getIdColumnName() are too large
 -

 Key: HIVE-8744
 URL: https://issues.apache.org/jira/browse/HIVE-8744
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.15.0
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-8744.1.patch, HIVE-8744.2.patch



[jira] [Updated] (HIVE-8435) Add identity project remover optimization

2014-11-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8435:
--
Attachment: HIVE-8435.07.patch

 Add identity project remover optimization
 -

 Key: HIVE-8435
 URL: https://issues.apache.org/jira/browse/HIVE-8435
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Sergey Shelukhin
 Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, 
 HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, 
 HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, 
 HIVE-8435.1.patch, HIVE-8435.patch


 In some cases the plan contains an identity project, which is useless. It is 
 better to optimize it away so it is not evaluated at runtime for no benefit.
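The optimization can be sketched on a toy operator tree: a project that emits exactly its input's columns, in order, is an identity and can be bypassed. The classes and names below are hypothetical illustrations, not Hive's optimizer:

```python
# Toy sketch of identity-project removal: if a "project" node's output
# columns equal its child's columns (same names, same order), link past it.
class Op:
    def __init__(self, name, columns, child=None):
        self.name, self.columns, self.child = name, columns, child

def remove_identity_projects(op):
    if op is None:
        return None
    # Optimize the subtree first, then decide about this node.
    op.child = remove_identity_projects(op.child)
    is_identity = (op.name == "project" and op.child is not None
                   and op.columns == op.child.columns)
    return op.child if is_identity else op

scan = Op("scan", ["id", "name"])
plan = Op("project", ["id", "name"], scan)  # identity: same columns, same order
print(remove_identity_projects(plan).name)  # → scan
```

A project that reorders or drops columns (e.g. `["name"]`) is not an identity and is kept.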



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8435) Add identity project remover optimization

2014-11-07 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8435:
--
Assignee: Jesús Camacho Rodríguez  (was: Sergey Shelukhin)
  Status: In Progress  (was: Patch Available)

 Add identity project remover optimization
 -

 Key: HIVE-8435
 URL: https://issues.apache.org/jira/browse/HIVE-8435
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 0.13.0, 0.12.0, 0.11.0, 0.10.0, 0.9.0
Reporter: Ashutosh Chauhan
Assignee: Jesús Camacho Rodríguez
 Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, 
 HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, 
 HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, 
 HIVE-8435.1.patch, HIVE-8435.patch


 In some cases the plan contains an identity project, which is useless. It is 
 better to optimize it away so it is not evaluated at runtime for no benefit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8827) Remove SSLv2Hello from list of disabled protocols

2014-11-12 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8827:
--
Attachment: HIVE-8827.1.patch

 Remove SSLv2Hello from list of disabled protocols
 -

 Key: HIVE-8827
 URL: https://issues.apache.org/jira/browse/HIVE-8827
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-8827.1.patch


 Turns out SSLv2Hello is not the same as SSLv2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8827) Remove SSLv2Hello from list of disabled protocols

2014-11-12 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8827:
--
Status: Patch Available  (was: Open)

 Remove SSLv2Hello from list of disabled protocols
 -

 Key: HIVE-8827
 URL: https://issues.apache.org/jira/browse/HIVE-8827
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-8827.1.patch


 Turns out SSLv2Hello is not the same as SSLv2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8857) hive release has SNAPSHOT dependency, which is not on maven central

2014-11-13 Thread JIRA
André Kelpe created HIVE-8857:
-

 Summary: hive release has SNAPSHOT dependency, which is not on 
maven central
 Key: HIVE-8857
 URL: https://issues.apache.org/jira/browse/HIVE-8857
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: André Kelpe


I just tried building a project that uses hive-exec as a dependency, and it 
bails out, since Hive 0.14.0 introduced a SNAPSHOT dependency on Apache 
Calcite, which is not on Maven Central. Do we have to include another 
repository now? Besides that, it also seems problematic to rely on a SNAPSHOT 
dependency, which can change at any time.

{code}
:compileJava
Download 
http://repo1.maven.org/maven2/org/apache/hive/hive-exec/0.14.0/hive-exec-0.14.0.pom
Download 
http://repo1.maven.org/maven2/org/apache/hive/hive/0.14.0/hive-0.14.0.pom
Download 
http://repo1.maven.org/maven2/org/apache/hive/hive-ant/0.14.0/hive-ant-0.14.0.pom
Download 
http://repo1.maven.org/maven2/org/apache/hive/hive-metastore/0.14.0/hive-metastore-0.14.0.pom
Download 
http://repo1.maven.org/maven2/org/apache/hive/hive-shims/0.14.0/hive-shims-0.14.0.pom
Download 
http://repo1.maven.org/maven2/org/fusesource/jansi/jansi/1.11/jansi-1.11.pom
Download 
http://repo1.maven.org/maven2/org/fusesource/jansi/jansi-project/1.11/jansi-project-1.11.pom
Download 
http://repo1.maven.org/maven2/org/fusesource/fusesource-pom/1.8/fusesource-pom-1.8.pom
Download 
http://repo1.maven.org/maven2/org/apache/hive/hive-serde/0.14.0/hive-serde-0.14.0.pom
Download 
http://repo1.maven.org/maven2/org/apache/hive/shims/hive-shims-common/0.14.0/hive-shims-common-0.14.0.pom
Download 
http://repo1.maven.org/maven2/org/apache/hive/shims/hive-shims-common-secure/0.14.0/hive-shims-common-secure-0.14.0.pom
Download 
http://repo1.maven.org/maven2/org/apache/hive/shims/hive-shims-0.20/0.14.0/hive-shims-0.20-0.14.0.pom
Download 
http://repo1.maven.org/maven2/org/apache/hive/shims/hive-shims-0.20S/0.14.0/hive-shims-0.20S-0.14.0.pom
Download 
http://repo1.maven.org/maven2/org/apache/hive/shims/hive-shims-0.23/0.14.0/hive-shims-0.23-0.14.0.pom
Download 
http://repo1.maven.org/maven2/org/apache/hive/hive-common/0.14.0/hive-common-0.14.0.pom
Download 
http://repo1.maven.org/maven2/org/apache/curator/curator-framework/2.6.0/curator-framework-2.6.0.pom
Download 
http://repo1.maven.org/maven2/org/apache/curator/apache-curator/2.6.0/apache-curator-2.6.0.pom
Download 
http://repo1.maven.org/maven2/org/apache/curator/curator-client/2.6.0/curator-client-2.6.0.pom
Download 
http://repo1.maven.org/maven2/org/slf4j/slf4j-api/1.7.6/slf4j-api-1.7.6.pom
Download 
http://repo1.maven.org/maven2/org/slf4j/slf4j-parent/1.7.6/slf4j-parent-1.7.6.pom

FAILURE: Build failed with an exception.

* What went wrong:
Could not resolve all dependencies for configuration ':provided'.
> Could not find org.apache.calcite:calcite-core:0.9.2-incubating-SNAPSHOT.
  Required by:
      cascading:cascading-hive:1.1.0-wip-dev > org.apache.hive:hive-exec:0.14.0
> Could not find org.apache.calcite:calcite-avatica:0.9.2-incubating-SNAPSHOT.
  Required by:
      cascading:cascading-hive:1.1.0-wip-dev > org.apache.hive:hive-exec:0.14.0

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug 
option to get more log output.

BUILD FAILED

Total time: 16.956 secs
{code}
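As a possible workaround, and assuming the Calcite snapshot artifacts are published to the Apache snapshots repository (an assumption, not verified here), a Gradle build could declare that repository so the -SNAPSHOT coordinates resolve:

```groovy
repositories {
    mavenCentral()
    // Hypothetical workaround: resolve Apache -SNAPSHOT artifacts.
    // Assumes the calcite snapshots are actually deployed to this repository.
    maven {
        url "https://repository.apache.org/content/repositories/snapshots"
    }
}
```

Even with such a workaround, the underlying concern stands: a release artifact depending on a mutable SNAPSHOT is fragile.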





[jira] [Updated] (HIVE-8862) Fix ordering diferences on TestParse tests due to Java8

2014-11-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8862:
--
Status: Patch Available  (was: Open)

 Fix ordering diferences on TestParse tests due to Java8
 ---

 Key: HIVE-8862
 URL: https://issues.apache.org/jira/browse/HIVE-8862
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-8862.1.patch


 This bug is related to HIVE-8607. All TestParse tests are failing on Java8 
 due to XML serialization incompatibilities with JDK7.
 These serialization issues are just ordering differences in the XML files 
 generated with JDK7, caused by the hash function used for HashMap/HashSet. In 
 order to fix this, we should use LinkedHashMap/LinkedHashSet instead, so we 
 can get the correct ordering.
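The ordering issue described above can be illustrated outside Hive. A minimal sketch (illustrative only, not Hive code), showing that LinkedHashMap gives a stable, insertion-order iteration while HashMap's iteration order is an implementation detail of the JDK:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class OrderingDemo {
    // Joins a map's keys in its iteration order.
    static String keyOrder(Map<String, String> m) {
        return String.join(",", m.keySet());
    }

    public static void main(String[] args) {
        Map<String, String> linked = new LinkedHashMap<>();
        linked.put("c", "1");
        linked.put("a", "2");
        linked.put("b", "3");
        // LinkedHashMap preserves insertion order on every JDK: c,a,b
        System.out.println(keyOrder(linked));

        // HashMap's iteration order depends on the hash function and may
        // differ between JDK7 and JDK8, which is what breaks golden files.
        Map<String, String> hashed = new HashMap<>(linked);
        System.out.println(keyOrder(hashed));
    }
}
```

Serializing the LinkedHashMap therefore produces identical XML across JDKs, which is the property the TestParse golden files rely on.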





[jira] [Updated] (HIVE-8862) Fix ordering diferences on TestParse tests due to Java8

2014-11-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8862:
--
Attachment: HIVE-8862.1.patch

This patch replaces HashMap/HashSet with LinkedHashMap/LinkedHashSet in all the 
places where the values will be serialized in TestParse tests.

It also includes a fix in QTestUtil.java for some incompatibilities between JDK7 
and JDK8. All .q.xml files had to be re-generated because of these 
changes.

 Fix ordering diferences on TestParse tests due to Java8
 ---

 Key: HIVE-8862
 URL: https://issues.apache.org/jira/browse/HIVE-8862
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-8862.1.patch


 This bug is related to HIVE-8607. All TestParse tests are failing on Java8 
 due to XML serialization incompatibilities with JDK7.
 These serialization issues are just ordering differences in the XML files 
 generated with JDK7, caused by the hash function used for HashMap/HashSet. In 
 order to fix this, we should use LinkedHashMap/LinkedHashSet instead, so we 
 can get the correct ordering.





[jira] [Work started] (HIVE-8749) Change Hadoop version on HIVE-8065 to 2.6-SNAPSHOT

2014-11-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-8749 started by Sergio Peña.
-
 Change Hadoop version on HIVE-8065 to 2.6-SNAPSHOT
 --

 Key: HIVE-8749
 URL: https://issues.apache.org/jira/browse/HIVE-8749
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Sergio Peña







[jira] [Commented] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-11-13 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14211466#comment-14211466
 ] 

Sergio Peña commented on HIVE-8359:
---

I'll take a look at the code. 

 Map containing null values are not correctly written in Parquet files
 -

 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI
Assignee: Sergio Peña
 Attachments: HIVE-8359.1.patch, map_null_val.avro


 Tried to write a map<string,string> column in a Parquet file. The table 
 should contain:
 {code}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {key1:null,key2:val2}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {code}
 ... and when you run a query like {code}SELECT * from mytable{code}, 
 we can see that the table is corrupted:
 {code}
 {key3:val3}
 {key4:val3}
 {key3:val2}
 {key4:val3}
 {key1:val3}
 {code}
 I've not been able to read the Parquet file in our software afterwards, and 
 consequently I suspect it to be corrupted. 
 For those who are interested, I generated this Parquet table from an Avro 
 file. 
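The shifted pairs in the corrupted output above (e.g. {key4:val3}) are what you get when a writer drops a null value but keeps emitting keys and values as independent streams. A hypothetical sketch, not the actual Hive Parquet writer, of the null-safe behavior a map writer needs:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NullMapDemo {
    // Serializes a map, emitting each key together with its value.
    // A null value is encoded explicitly instead of being dropped, so a
    // null entry can never shift later values onto the wrong keys.
    static String serialize(Map<String, String> m) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : m.entrySet()) {
            if (!first) sb.append(",");
            first = false;
            sb.append(e.getKey()).append(":")
              .append(e.getValue() == null ? "null" : e.getValue());
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("key3", "val3");
        row.put("key4", null);
        System.out.println(serialize(row)); // {key3:val3,key4:null}
    }
}
```

The essential point is that a key and its (possibly null) value are written as one unit; decoupling them is the class of bug that produces the mismatched rows shown above.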





[jira] [Updated] (HIVE-8749) Change Hadoop version on HIVE-8065 to 2.6-SNAPSHOT

2014-11-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8749:
--
Attachment: HIVE-8749.1.patch

 Change Hadoop version on HIVE-8065 to 2.6-SNAPSHOT
 --

 Key: HIVE-8749
 URL: https://issues.apache.org/jira/browse/HIVE-8749
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Sergio Peña
 Attachments: HIVE-8749.1.patch








[jira] [Updated] (HIVE-8749) Change Hadoop version on HIVE-8065 to 2.6-SNAPSHOT

2014-11-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8749:
--
Status: Patch Available  (was: In Progress)

 Change Hadoop version on HIVE-8065 to 2.6-SNAPSHOT
 --

 Key: HIVE-8749
 URL: https://issues.apache.org/jira/browse/HIVE-8749
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Sergio Peña
 Attachments: HIVE-8749.1.patch








[jira] [Work started] (HIVE-8750) Commit initial encryption work

2014-11-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-8750 started by Sergio Peña.
-
 Commit initial encryption work
 --

 Key: HIVE-8750
 URL: https://issues.apache.org/jira/browse/HIVE-8750
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Sergio Peña

 I believe Sergio has some work done for encryption. In this item we'll commit 
 it to branch.





[jira] [Created] (HIVE-8869) RowSchema not updated for some ops when columns are pruned

2014-11-14 Thread JIRA
Jesús Camacho Rodríguez created HIVE-8869:
-

 Summary: RowSchema not updated for some ops when columns are pruned
 Key: HIVE-8869
 URL: https://issues.apache.org/jira/browse/HIVE-8869
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0, 0.15.0
Reporter: Jesús Camacho Rodríguez
Assignee: Jesús Camacho Rodríguez


When columns are pruned in ColumnPrunerProcFactory, updating the row schema 
behavior is not consistent among operators: some will update their RowSchema, 
while some others will not.
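The invariant the fix is after can be shown with a simplified sketch; the Op class below is illustrative only, not Hive's actual Operator/RowSchema API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PruneDemo {
    // A toy operator: "columns" is what it actually produces, "rowSchema"
    // is what it advertises to downstream operators. The bug class is
    // pruning the former without updating the latter.
    static class Op {
        List<String> columns;
        List<String> rowSchema;

        Op(List<String> cols) {
            this.columns = new ArrayList<>(cols);
            this.rowSchema = new ArrayList<>(cols);
        }

        // Pruning keeps only the needed columns AND refreshes the schema
        // in the same step, so schema and tuples always agree.
        void prune(List<String> needed) {
            columns.retainAll(needed);
            rowSchema = new ArrayList<>(columns);
        }
    }

    public static void main(String[] args) {
        Op op = new Op(Arrays.asList("a", "b", "c"));
        op.prune(Arrays.asList("a", "c"));
        System.out.println(op.rowSchema); // [a, c]
    }
}
```

When some operators update the schema on prune and others do not, downstream consumers see a schema that no longer matches the tuples, which is the inconsistency this issue describes.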





[jira] [Work started] (HIVE-8869) RowSchema not updated for some ops when columns are pruned

2014-11-14 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-8869 started by Jesús Camacho Rodríguez.
-
 RowSchema not updated for some ops when columns are pruned
 --

 Key: HIVE-8869
 URL: https://issues.apache.org/jira/browse/HIVE-8869
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0, 0.15.0
Reporter: Jesús Camacho Rodríguez
Assignee: Jesús Camacho Rodríguez

 When columns are pruned in ColumnPrunerProcFactory, updating the row schema 
 behavior is not consistent among operators: some will update their RowSchema, 
 while some others will not.





[jira] [Updated] (HIVE-8862) Fix ordering diferences on TestParse tests due to Java8

2014-11-14 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8862:
--
Status: Open  (was: Patch Available)

 Fix ordering diferences on TestParse tests due to Java8
 ---

 Key: HIVE-8862
 URL: https://issues.apache.org/jira/browse/HIVE-8862
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-8862.1.patch, HIVE-8862.2.patch


 This bug is related to HIVE-8607. All TestParse tests are failing on Java8 
 due to XML serialization incompatibilities with JDK7.
 These serialization issues are just ordering differences in the XML files 
 generated with JDK7, caused by the hash function used for HashMap/HashSet. In 
 order to fix this, we should use LinkedHashMap/LinkedHashSet instead, so we 
 can get the correct ordering.





[jira] [Updated] (HIVE-8862) Fix ordering diferences on TestParse tests due to Java8

2014-11-14 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8862:
--
Status: Patch Available  (was: Open)

 Fix ordering diferences on TestParse tests due to Java8
 ---

 Key: HIVE-8862
 URL: https://issues.apache.org/jira/browse/HIVE-8862
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-8862.1.patch, HIVE-8862.2.patch


 This bug is related to HIVE-8607. All TestParse tests are failing on Java8 
 due to XML serialization incompatibilities with JDK7.
 These serialization issues are just ordering differences in the XML files 
 generated with JDK7, caused by the hash function used for HashMap/HashSet. In 
 order to fix this, we should use LinkedHashMap/LinkedHashSet instead, so we 
 can get the correct ordering.





[jira] [Updated] (HIVE-8869) RowSchema not updated for some ops when columns are pruned

2014-11-14 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8869:
--
Status: Patch Available  (was: In Progress)

The row schemas of the lateral view and table scan operators are now updated 
after column pruning is applied; thus, the schema conforms to the tuples that 
go through the operator, which is important e.g. for HIVE-8435.

As a side effect, the statistics for the lateral views change, so I have 
uploaded the changes to those test files.

[~ashutoshc], can you check it please?

 RowSchema not updated for some ops when columns are pruned
 --

 Key: HIVE-8869
 URL: https://issues.apache.org/jira/browse/HIVE-8869
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0, 0.15.0
Reporter: Jesús Camacho Rodríguez
Assignee: Jesús Camacho Rodríguez

 When columns are pruned in ColumnPrunerProcFactory, updating the row schema 
 behavior is not consistent among operators: some will update their RowSchema, 
 while some others will not.





[jira] [Updated] (HIVE-8869) RowSchema not updated for some ops when columns are pruned

2014-11-14 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8869:
--
Attachment: HIVE-8869.patch

 RowSchema not updated for some ops when columns are pruned
 --

 Key: HIVE-8869
 URL: https://issues.apache.org/jira/browse/HIVE-8869
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0, 0.15.0
Reporter: Jesús Camacho Rodríguez
Assignee: Jesús Camacho Rodríguez
 Attachments: HIVE-8869.patch


 When columns are pruned in ColumnPrunerProcFactory, updating the row schema 
 behavior is not consistent among operators: some will update their RowSchema, 
 while some others will not.





[jira] [Updated] (HIVE-8869) RowSchema not updated for some ops when columns are pruned

2014-11-16 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8869:
--
Attachment: HIVE-8869.01.patch

When we update the schema of the TableScan, we need to check whether the 
prunecols list is null or empty.
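That check can be sketched as a simple guard; the method and parameter names below are hypothetical, not the actual ColumnPrunerProcFactory code:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PruneGuard {
    // Only rewrite the TableScan's schema when there is a usable
    // pruned-column list; a null or empty list means "leave it alone".
    static boolean shouldUpdateSchema(List<String> prunedCols) {
        return prunedCols != null && !prunedCols.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(shouldUpdateSchema(null));                    // false
        System.out.println(shouldUpdateSchema(Collections.emptyList())); // false
        System.out.println(shouldUpdateSchema(Arrays.asList("a")));      // true
    }
}
```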

 RowSchema not updated for some ops when columns are pruned
 --

 Key: HIVE-8869
 URL: https://issues.apache.org/jira/browse/HIVE-8869
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0, 0.15.0
Reporter: Jesús Camacho Rodríguez
Assignee: Jesús Camacho Rodríguez
 Attachments: HIVE-8869.01.patch, HIVE-8869.patch


 When columns are pruned in ColumnPrunerProcFactory, updating the row schema 
 behavior is not consistent among operators: some will update their RowSchema, 
 while some others will not.





[jira] [Updated] (HIVE-8869) RowSchema not updated for some ops when columns are pruned

2014-11-17 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8869:
--
Attachment: HIVE-8869.01.patch

 RowSchema not updated for some ops when columns are pruned
 --

 Key: HIVE-8869
 URL: https://issues.apache.org/jira/browse/HIVE-8869
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0, 0.15.0
Reporter: Jesús Camacho Rodríguez
Assignee: Jesús Camacho Rodríguez
 Attachments: HIVE-8869.01.patch, HIVE-8869.patch


 When columns are pruned in ColumnPrunerProcFactory, updating the row schema 
 behavior is not consistent among operators: some will update their RowSchema, 
 while some others will not.





[jira] [Updated] (HIVE-8869) RowSchema not updated for some ops when columns are pruned

2014-11-17 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8869:
--
Attachment: (was: HIVE-8869.01.patch)

 RowSchema not updated for some ops when columns are pruned
 --

 Key: HIVE-8869
 URL: https://issues.apache.org/jira/browse/HIVE-8869
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0, 0.15.0
Reporter: Jesús Camacho Rodríguez
Assignee: Jesús Camacho Rodríguez
 Attachments: HIVE-8869.01.patch, HIVE-8869.patch


 When columns are pruned in ColumnPrunerProcFactory, updating the row schema 
 behavior is not consistent among operators: some will update their RowSchema, 
 while some others will not.





[jira] [Created] (HIVE-8896) expose (hadoop/tez) job ids in API

2014-11-17 Thread JIRA
André Kelpe created HIVE-8896:
-

 Summary: expose (hadoop/tez) job ids in API
 Key: HIVE-8896
 URL: https://issues.apache.org/jira/browse/HIVE-8896
 Project: Hive
  Issue Type: Improvement
  Components: Clients
Reporter: André Kelpe


In many cases it would be very useful to be able to map the Hadoop/Tez jobs 
back to the query that was executed or is currently being executed. Especially 
when Hive queries are run within a bigger process, the ability to get the job 
IDs and query for counters is very beneficial to projects embedding Hive. 

I saw that Cloudera's Hue parses the logs produced by Hive in order to get to 
the job IDs. That seems rather brittle and can easily break whenever the log 
format changes. Exposing the job IDs in the API would make it a lot easier 
to build integrations like Hue.





[jira] [Updated] (HIVE-8869) RowSchema not updated for some ops when columns are pruned

2014-11-17 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8869:
--
Attachment: HIVE-8869.02.patch

It seems some readers (Accumulo, Vectorization) assume that the schema should 
contain all the columns of the tuples that are read, even if the columns are 
pruned... I changed the patch so the schema of the TableScan is not changed.

 RowSchema not updated for some ops when columns are pruned
 --

 Key: HIVE-8869
 URL: https://issues.apache.org/jira/browse/HIVE-8869
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0, 0.15.0
Reporter: Jesús Camacho Rodríguez
Assignee: Jesús Camacho Rodríguez
 Attachments: HIVE-8869.01.patch, HIVE-8869.02.patch, HIVE-8869.patch


 When columns are pruned in ColumnPrunerProcFactory, updating the row schema 
 behavior is not consistent among operators: some will update their RowSchema, 
 while some others will not.





[jira] [Updated] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-11-17 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8359:
--
Status: Open  (was: Patch Available)

 Map containing null values are not correctly written in Parquet files
 -

 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI
Assignee: Sergio Peña
 Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, 
 map_null_val.avro


 Tried to write a map<string,string> column in a Parquet file. The table 
 should contain:
 {code}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {key1:null,key2:val2}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {code}
 ... and when you run a query like {code}SELECT * from mytable{code}, 
 we can see that the table is corrupted:
 {code}
 {key3:val3}
 {key4:val3}
 {key3:val2}
 {key4:val3}
 {key1:val3}
 {code}
 I've not been able to read the Parquet file in our software afterwards, and 
 consequently I suspect it to be corrupted. 
 For those who are interested, I generated this Parquet table from an Avro 
 file. 





[jira] [Updated] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-11-17 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8359:
--
Status: Patch Available  (was: Open)

 Map containing null values are not correctly written in Parquet files
 -

 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI
Assignee: Sergio Peña
 Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, 
 map_null_val.avro


 Tried to write a map<string,string> column in a Parquet file. The table 
 should contain:
 {code}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {key1:null,key2:val2}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {code}
 ... and when you run a query like {code}SELECT * from mytable{code}, 
 we can see that the table is corrupted:
 {code}
 {key3:val3}
 {key4:val3}
 {key3:val2}
 {key4:val3}
 {key1:val3}
 {code}
 I've not been able to read the Parquet file in our software afterwards, and 
 consequently I suspect it to be corrupted. 
 For those who are interested, I generated this Parquet table from an Avro 
 file. 





[jira] [Updated] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-11-17 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8359:
--
Attachment: HIVE-8359.4.patch

 Map containing null values are not correctly written in Parquet files
 -

 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI
Assignee: Sergio Peña
 Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, 
 map_null_val.avro


 Tried to write a map<string,string> column in a Parquet file. The table 
 should contain:
 {code}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {key1:null,key2:val2}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {code}
 ... and when you run a query like {code}SELECT * from mytable{code}, 
 we can see that the table is corrupted:
 {code}
 {key3:val3}
 {key4:val3}
 {key3:val2}
 {key4:val3}
 {key1:val3}
 {code}
 I've not been able to read the Parquet file in our software afterwards, and 
 consequently I suspect it to be corrupted. 
 For those who are interested, I generated this Parquet table from an Avro 
 file. 





[jira] [Updated] (HIVE-8750) Commit initial encryption work

2014-11-17 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8750:
--
Status: Patch Available  (was: In Progress)

 Commit initial encryption work
 --

 Key: HIVE-8750
 URL: https://issues.apache.org/jira/browse/HIVE-8750
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Sergio Peña
 Attachments: HIVE-8750.1.patch


 I believe Sergio has some work done for encryption. In this item we'll commit 
 it to branch.





[jira] [Updated] (HIVE-8750) Commit initial encryption work

2014-11-17 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8750:
--
Attachment: HIVE-8750.1.patch

 Commit initial encryption work
 --

 Key: HIVE-8750
 URL: https://issues.apache.org/jira/browse/HIVE-8750
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Sergio Peña
 Attachments: HIVE-8750.1.patch


 I believe Sergio has some work done for encryption. In this item we'll commit 
 it to branch.





[jira] [Commented] (HIVE-8869) RowSchema not updated for some ops when columns are pruned

2014-11-17 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14215212#comment-14215212
 ] 

Jesús Camacho Rodríguez commented on HIVE-8869:
---

[~gopalv], the last version of the patch does not change anything for the 
schema of the TableScan operator, so nothing will break; it just takes into 
account column pruning wrt lateral views.

 RowSchema not updated for some ops when columns are pruned
 --

 Key: HIVE-8869
 URL: https://issues.apache.org/jira/browse/HIVE-8869
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0, 0.15.0
Reporter: Jesús Camacho Rodríguez
Assignee: Jesús Camacho Rodríguez
 Fix For: 0.15.0

 Attachments: HIVE-8869.01.patch, HIVE-8869.02.patch, HIVE-8869.patch


 When columns are pruned in ColumnPrunerProcFactory, updating the row schema 
 behavior is not consistent among operators: some will update their RowSchema, 
 while some others will not.





[jira] [Commented] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-11-18 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216336#comment-14216336
 ] 

Sergio Peña commented on HIVE-8359:
---

Thanks [~mickaellcr].

Sorry for the confusion. I did not see that you had uploaded another patch here. 
I just added two extra lines to the patch you uploaded. I will integrate your 
fixes there and upload the patch again.

 Map containing null values are not correctly written in Parquet files
 -

 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI
Assignee: Sergio Peña
 Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, 
 map_null_val.avro


 Tried to write a map<string,string> column in a Parquet file. The table 
 should contain:
 {code}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {key1:null,key2:val2}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {code}
 ... and when you run a query like {code}SELECT * from mytable{code}, 
 we can see that the table is corrupted:
 {code}
 {key3:val3}
 {key4:val3}
 {key3:val2}
 {key4:val3}
 {key1:val3}
 {code}
 I've not been able to read the Parquet file in our software afterwards, and 
 consequently I suspect it to be corrupted. 
 For those who are interested, I generated this Parquet table from an Avro 
 file. 





[jira] [Updated] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-11-18 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8359:
--
Status: Open  (was: Patch Available)

 Map containing null values are not correctly written in Parquet files
 -

 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI
Assignee: Sergio Peña
 Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, 
 HIVE-8359.5.patch, map_null_val.avro


 Tried to write a map<string,string> column in a Parquet file. The table 
 should contain:
 {code}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {key1:null,key2:val2}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {code}
 ... and when you run a query like {code}SELECT * from mytable{code}, 
 we can see that the table is corrupted:
 {code}
 {key3:val3}
 {key4:val3}
 {key3:val2}
 {key4:val3}
 {key1:val3}
 {code}
 I've not been able to read the Parquet file in our software afterwards, and 
 consequently I suspect it to be corrupted. 
 For those who are interested, I generated this Parquet table from an Avro 
 file. 





[jira] [Updated] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-11-18 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8359:
--
Status: Patch Available  (was: Open)

 Map containing null values are not correctly written in Parquet files
 -

 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI
Assignee: Sergio Peña
 Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, 
 HIVE-8359.5.patch, map_null_val.avro


 Tried to write a map<string,string> column in a Parquet file. The table 
 should contain:
 {code}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {key1:null,key2:val2}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {code}
 ... and when you run a query like {code}SELECT * from mytable{code}, 
 we can see that the table is corrupted:
 {code}
 {key3:val3}
 {key4:val3}
 {key3:val2}
 {key4:val3}
 {key1:val3}
 {code}
 I've not been able to read the Parquet file in our software afterwards, and 
 consequently I suspect it to be corrupted. 
 For those who are interested, I generated this Parquet table from an Avro 
 file. 





[jira] [Updated] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-11-18 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8359:
--
Attachment: HIVE-8359.5.patch

Attached a new patch that integrates Mickael Lacour's HIVE-6994 fix.

 Map containing null values are not correctly written in Parquet files
 -

 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI
Assignee: Sergio Peña
 Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, 
 HIVE-8359.5.patch, map_null_val.avro


 Tried to write a map<string,string> column in a Parquet file. The table 
 should contain:
 {code}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {key1:null,key2:val2}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {code}
 ... and when you run a query like {code}SELECT * from mytable{code}, 
 we can see that the table is corrupted:
 {code}
 {key3:val3}
 {key4:val3}
 {key3:val2}
 {key4:val3}
 {key1:val3}
 {code}
 I've not been able to read the Parquet file in our software afterwards, and 
 consequently I suspect it to be corrupted. 
 For those who are interested, I generated this Parquet table from an Avro 
 file. 





[jira] [Updated] (HIVE-8435) Add identity project remover optimization

2014-11-18 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8435:
--
Attachment: HIVE-8435.08.patch

Starting over.

 Add identity project remover optimization
 -

 Key: HIVE-8435
 URL: https://issues.apache.org/jira/browse/HIVE-8435
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Jesús Camacho Rodríguez
 Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, 
 HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, 
 HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, 
 HIVE-8435.08.patch, HIVE-8435.1.patch, HIVE-8435.patch


 In some cases there is an identity project in the plan which is useless. Better 
 to optimize it away to avoid evaluating it at runtime without any benefit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8435) Add identity project remover optimization

2014-11-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8435:
--
Attachment: HIVE-8435.09.patch

This patch is the same as .08, but it also contains the changes to the test 
result files.

 Add identity project remover optimization
 -

 Key: HIVE-8435
 URL: https://issues.apache.org/jira/browse/HIVE-8435
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Jesús Camacho Rodríguez
 Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, 
 HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, 
 HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, 
 HIVE-8435.08.patch, HIVE-8435.09.patch, HIVE-8435.1.patch, HIVE-8435.patch


 In some cases there is an identity project in the plan which is useless. Better 
 to optimize it away to avoid evaluating it at runtime without any benefit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6914) parquet-hive cannot write nested map (map value is map)

2014-11-19 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218250#comment-14218250
 ] 

Sergio Peña commented on HIVE-6914:
---

Hi [~mickaellcr],

It sounds good if you use the patch from HIVE-8359 for this bug. Regarding 
adding the qtests to HIVE-8909, I think that ticket is meant to fix the reading 
of the different nested type formats generated by the Thrift and Avro tools (it 
does not touch the writing part), so I think it would be good to keep these 
writing tests separate from the reading tests.



 parquet-hive cannot write nested map (map value is map)
 ---

 Key: HIVE-6914
 URL: https://issues.apache.org/jira/browse/HIVE-6914
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.12.0, 0.13.0
Reporter: Tongjie Chen
  Labels: parquet, serialization
 Attachments: HIVE-6914.1.patch, HIVE-6914.2.patch


 // table schema (identical for both plain text version and parquet version)
 hive> desc text_mmap;
 m map<string,map<string,string>>
 // sample nested map entry
 {level1:{level2_key1:value1,level2_key2:value2}}
 The following query will fail:
 insert overwrite table parquet_mmap select * from text_mmap;
 Caused by: parquet.io.ParquetEncodingException: This should be an 
 ArrayWritable or MapWritable: 
 org.apache.hadoop.hive.ql.io.parquet.writable.BinaryWritable@f2f8106
 at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeData(DataWritableWriter.java:85)
 at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeArray(DataWritableWriter.java:118)
 at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeData(DataWritableWriter.java:80)
 at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeData(DataWritableWriter.java:82)
 at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.write(DataWritableWriter.java:55)
 at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriteSupport.write(DataWritableWriteSupport.java:59)
 at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriteSupport.write(DataWritableWriteSupport.java:31)
 at 
 parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:115)
 at parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:81)
 at parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:37)
 at 
 org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.write(ParquetRecordWriterWrapper.java:77)
 at 
 org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.write(ParquetRecordWriterWrapper.java:90)
 at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:622)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
 at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
 at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
 ... 9 more
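 The stack trace above fails inside the writer's recursive type dispatch. A
 hypothetical sketch of that failure mode (illustrative names only, not Hive's
 `DataWritableWriter`): when the schema cursor drifts one level deeper than the
 data, a group schema is asked to write a scalar leaf, which raises the same
 shape of error as the exception above.

```python
def write_data(value, schema):
    """Recursive writer dispatch over a toy schema of nested maps."""
    if schema["kind"] == "primitive":
        return value  # primitive leaves are written as-is
    # group types (maps here) must receive a container value
    if not isinstance(value, dict):
        raise TypeError(
            "This should be an ArrayWritable or MapWritable: %r" % (value,))
    return {k: write_data(v, schema["value"]) for k, v in value.items()}

nested = {"level1": {"level2_key1": "value1"}}
# a schema cursor that is one level too deep: it still expects a map
# where the data already holds a string leaf
drifted = {"kind": "map",
           "value": {"kind": "map",
                     "value": {"kind": "map",
                               "value": {"kind": "primitive"}}}}
try:
    write_data(nested, drifted)
except TypeError as e:
    error = str(e)  # same shape of failure as the stack trace above
```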



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8909) Hive doesn't correctly read Parquet nested types

2014-11-19 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218252#comment-14218252
 ] 

Sergio Peña commented on HIVE-8909:
---

[~rdblue], is this ticket related to the different nested types found in this 
document?
https://github.com/rdblue/incubator-parquet-format/blob/PARQUET-113-add-list-and-map-spec/LogicalTypes.md

 Hive doesn't correctly read Parquet nested types
 

 Key: HIVE-8909
 URL: https://issues.apache.org/jira/browse/HIVE-8909
 Project: Hive
  Issue Type: Bug
Reporter: Ryan Blue
Assignee: Ryan Blue
 Attachments: HIVE-8909-1.patch


 Parquet's Avro and Thrift object models don't produce the same parquet type 
 representation for lists and maps that Hive does. In the Parquet community, 
 we've defined what should be written and backward-compatibility rules for 
 existing data written by parquet-avro and parquet-thrift in PARQUET-113. We 
 need to implement those rules in the Hive Converter classes.
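 A hedged paraphrase of the kind of backward-compatibility rule PARQUET-113
 defines for lists (this is a sketch of the rule's shape, not the authoritative
 spec text; the function name is invented): given the repeated group inside a
 LIST annotation, a reader must decide whether that group is the element itself
 (legacy 2-level output from parquet-avro/parquet-thrift) or the 3-level
 wrapper around the real element.

```python
def repeated_group_is_element(name, num_fields):
    """Decide if a LIST's repeated group is the element (legacy) or a wrapper."""
    if num_fields != 1:
        return True   # a multi-field repeated group must be the element itself
    if name == "array" or name.endswith("_tuple"):
        return True   # legacy parquet-avro / parquet-thrift naming conventions
    return False      # standard 3-level structure: single-field wrapper group

# parquet-avro legacy:  repeated group named "array"  -> the element itself
# standard 3-level form: repeated group "list" holding one "element" field
```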



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8745) Joins on decimal keys return different results whether they are run as reduce join or map join

2014-11-19 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218342#comment-14218342
 ] 

Sergio Peña commented on HIVE-8745:
---

[~leftylev] 

I believe you added a documentation statement for the HIVE-7373 fix; but this 
patch reverts the trailing zeroes fix, so you might want to revert that 
documentation statement as well.

 Joins on decimal keys return different results whether they are run as reduce 
 join or map join
 --

 Key: HIVE-8745
 URL: https://issues.apache.org/jira/browse/HIVE-8745
 Project: Hive
  Issue Type: Bug
  Components: Types
Affects Versions: 0.14.0
Reporter: Gunther Hagleitner
Assignee: Jason Dere
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8745.1.patch, HIVE-8745.2.patch, HIVE-8745.3.patch, 
 join_test.q


 See attached .q file to reproduce. The difference seems to be whether 
 trailing 0s are considered the same value or not.
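 The trailing-zero symptom described above can be shown in miniature (this is
 an illustration of the general hazard, not Hive's join code): two decimal keys
 can be equal numerically yet differ in their serialized form, so a join path
 that compares values and one that compares serialized keys can disagree.

```python
from decimal import Decimal

a, b = Decimal("5"), Decimal("5.0")
numeric_equal = (a == b)               # True: the same numeric key
serialized_equal = (str(a) == str(b))  # False: "5" vs "5.0" differ as bytes
```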



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8918) Beeline terminal cannot be initialized due to jline2 change

2014-11-19 Thread JIRA
Sergio Peña created HIVE-8918:
-

 Summary: Beeline terminal cannot be initialized due to jline2 
change
 Key: HIVE-8918
 URL: https://issues.apache.org/jira/browse/HIVE-8918
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.15.0
Reporter: Sergio Peña


I fetched the latest changes from trunk, and I got the following error when 
attempting to execute beeline:

{noformat}
[ERROR] Terminal initialization failed; falling back to unsupported
java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but 
interface was expected
at jline.TerminalFactory.create(TerminalFactory.java:101)
at jline.TerminalFactory.get(TerminalFactory.java:158)
at org.apache.hive.beeline.BeeLineOpts.init(BeeLineOpts.java:73)
at org.apache.hive.beeline.BeeLine.init(BeeLine.java:117)
at 
org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:469)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:453)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

Exception in thread "main" java.lang.IncompatibleClassChangeError: Found class 
jline.Terminal, but interface was expected
at org.apache.hive.beeline.BeeLineOpts.init(BeeLineOpts.java:101)
at org.apache.hive.beeline.BeeLine.init(BeeLine.java:117)
at 
org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:469)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:453)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}

I executed the following command:
{noformat}
hive --service beeline -u jdbc:hive2://localhost:1 -n sergio
{noformat}

The commit before the jline2 change works fine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8918) Beeline terminal cannot be initialized due to jline2 change

2014-11-19 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218574#comment-14218574
 ] 

Sergio Peña commented on HIVE-8918:
---

FYI [~Ferd]. You worked on moving to jline2, so you might have some ideas about 
what is happening.

 Beeline terminal cannot be initialized due to jline2 change
 ---

 Key: HIVE-8918
 URL: https://issues.apache.org/jira/browse/HIVE-8918
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.15.0
Reporter: Sergio Peña

 I fetched the latest changes from trunk, and I got the following error when 
 attempting to execute beeline:
 {noformat}
 [ERROR] Terminal initialization failed; falling back to unsupported
 java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but 
 interface was expected
   at jline.TerminalFactory.create(TerminalFactory.java:101)
   at jline.TerminalFactory.get(TerminalFactory.java:158)
   at org.apache.hive.beeline.BeeLineOpts.init(BeeLineOpts.java:73)
   at org.apache.hive.beeline.BeeLine.init(BeeLine.java:117)
   at 
 org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:469)
   at org.apache.hive.beeline.BeeLine.main(BeeLine.java:453)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 Exception in thread "main" java.lang.IncompatibleClassChangeError: Found 
 class jline.Terminal, but interface was expected
   at org.apache.hive.beeline.BeeLineOpts.init(BeeLineOpts.java:101)
   at org.apache.hive.beeline.BeeLine.init(BeeLine.java:117)
   at 
 org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:469)
   at org.apache.hive.beeline.BeeLine.main(BeeLine.java:453)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 {noformat}
 I executed the following command:
 {noformat}
 hive --service beeline -u jdbc:hive2://localhost:1 -n sergio
 {noformat}
 The commit before the jline2 change works fine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8919) Fix FileUtils.copy() method to call distcp only for HDFS files (not local files)

2014-11-19 Thread JIRA
Sergio Peña created HIVE-8919:
-

 Summary: Fix FileUtils.copy() method to call distcp only for HDFS 
files (not local files)
 Key: HIVE-8919
 URL: https://issues.apache.org/jira/browse/HIVE-8919
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergio Peña
Assignee: Sergio Peña


When loading a big file (> 32Mb) from the local filesystem to the HDFS 
filesystem, Hive fails because the local filesystem cannot load the 'distcp' 
class.

The 'distcp' class is used only by the HDFS filesystem.

We should use distcp only when copying files within the HDFS filesystem.
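The fix described above amounts to a scheme check before choosing the copy
strategy. A minimal sketch, assuming a 32Mb size threshold and a helper whose
name and signature are illustrative (not Hive's actual FileUtils API):

```python
from urllib.parse import urlparse

def should_use_distcp(src_uri, dst_uri, size_bytes, threshold=32 * 1024 * 1024):
    """Use distcp only for large copies where BOTH endpoints are HDFS."""
    src = urlparse(src_uri).scheme
    dst = urlparse(dst_uri).scheme
    return src == "hdfs" and dst == "hdfs" and size_bytes > threshold

# local-to-HDFS loads fall back to a plain copy, avoiding the distcp
# class-loading failure described above
```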



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HIVE-8919) Fix FileUtils.copy() method to call distcp only for HDFS files (not local files)

2014-11-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-8919 started by Sergio Peña.
-
 Fix FileUtils.copy() method to call distcp only for HDFS files (not local 
 files)
 

 Key: HIVE-8919
 URL: https://issues.apache.org/jira/browse/HIVE-8919
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergio Peña
Assignee: Sergio Peña

 When loading a big file (> 32Mb) from the local filesystem to the HDFS 
 filesystem, Hive fails because the local filesystem cannot load the 'distcp' 
 class.
 The 'distcp' class is used only by the HDFS filesystem.
 We should use distcp only when copying files within the HDFS filesystem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8919) Fix FileUtils.copy() method to call distcp only for HDFS files (not local files)

2014-11-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8919:
--
Status: Patch Available  (was: In Progress)

 Fix FileUtils.copy() method to call distcp only for HDFS files (not local 
 files)
 

 Key: HIVE-8919
 URL: https://issues.apache.org/jira/browse/HIVE-8919
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: CDH-23392.1.patch


 When loading a big file (> 32Mb) from the local filesystem to the HDFS 
 filesystem, Hive fails because the local filesystem cannot load the 'distcp' 
 class.
 The 'distcp' class is used only by the HDFS filesystem.
 We should use distcp only when copying files within the HDFS filesystem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8919) Fix FileUtils.copy() method to call distcp only for HDFS files (not local files)

2014-11-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8919:
--
Attachment: CDH-23392.1.patch

 Fix FileUtils.copy() method to call distcp only for HDFS files (not local 
 files)
 

 Key: HIVE-8919
 URL: https://issues.apache.org/jira/browse/HIVE-8919
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: CDH-23392.1.patch


 When loading a big file (> 32Mb) from the local filesystem to the HDFS 
 filesystem, Hive fails because the local filesystem cannot load the 'distcp' 
 class.
 The 'distcp' class is used only by the HDFS filesystem.
 We should use distcp only when copying files within the HDFS filesystem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-8918) Beeline terminal cannot be initialized due to jline2 change

2014-11-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña resolved HIVE-8918.
---
Resolution: Invalid

 Beeline terminal cannot be initialized due to jline2 change
 ---

 Key: HIVE-8918
 URL: https://issues.apache.org/jira/browse/HIVE-8918
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.15.0
Reporter: Sergio Peña

 I fetched the latest changes from trunk, and I got the following error when 
 attempting to execute beeline:
 {noformat}
 [ERROR] Terminal initialization failed; falling back to unsupported
 java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but 
 interface was expected
   at jline.TerminalFactory.create(TerminalFactory.java:101)
   at jline.TerminalFactory.get(TerminalFactory.java:158)
   at org.apache.hive.beeline.BeeLineOpts.init(BeeLineOpts.java:73)
   at org.apache.hive.beeline.BeeLine.init(BeeLine.java:117)
   at 
 org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:469)
   at org.apache.hive.beeline.BeeLine.main(BeeLine.java:453)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 Exception in thread "main" java.lang.IncompatibleClassChangeError: Found 
 class jline.Terminal, but interface was expected
   at org.apache.hive.beeline.BeeLineOpts.init(BeeLineOpts.java:101)
   at org.apache.hive.beeline.BeeLine.init(BeeLine.java:117)
   at 
 org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:469)
   at org.apache.hive.beeline.BeeLine.main(BeeLine.java:453)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 {noformat}
 I executed the following command:
 {noformat}
 hive --service beeline -u jdbc:hive2://localhost:1 -n sergio
 {noformat}
 The commit before the jline2 change works fine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8918) Beeline terminal cannot be initialized due to jline2 change

2014-11-19 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218979#comment-14218979
 ] 

Sergio Peña commented on HIVE-8918:
---

Thanks [~Ferd].

That was the problem. I deleted the jline-0.9.94.jar found in the hadoop lib 
directory and it worked.

 Beeline terminal cannot be initialized due to jline2 change
 ---

 Key: HIVE-8918
 URL: https://issues.apache.org/jira/browse/HIVE-8918
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.15.0
Reporter: Sergio Peña

 I fetched the latest changes from trunk, and I got the following error when 
 attempting to execute beeline:
 {noformat}
 [ERROR] Terminal initialization failed; falling back to unsupported
 java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but 
 interface was expected
   at jline.TerminalFactory.create(TerminalFactory.java:101)
   at jline.TerminalFactory.get(TerminalFactory.java:158)
   at org.apache.hive.beeline.BeeLineOpts.init(BeeLineOpts.java:73)
   at org.apache.hive.beeline.BeeLine.init(BeeLine.java:117)
   at 
 org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:469)
   at org.apache.hive.beeline.BeeLine.main(BeeLine.java:453)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 Exception in thread "main" java.lang.IncompatibleClassChangeError: Found 
 class jline.Terminal, but interface was expected
   at org.apache.hive.beeline.BeeLineOpts.init(BeeLineOpts.java:101)
   at org.apache.hive.beeline.BeeLine.init(BeeLine.java:117)
   at 
 org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:469)
   at org.apache.hive.beeline.BeeLine.main(BeeLine.java:453)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 {noformat}
 I executed the following command:
 {noformat}
 hive --service beeline -u jdbc:hive2://localhost:1 -n sergio
 {noformat}
 The commit before the jline2 change works fine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HIVE-8435) Add identity project remover optimization

2014-11-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez reopened HIVE-8435:
---

 Add identity project remover optimization
 -

 Key: HIVE-8435
 URL: https://issues.apache.org/jira/browse/HIVE-8435
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Jesús Camacho Rodríguez
  Labels: TODOC15
 Fix For: 0.15.0

 Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, 
 HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, 
 HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, 
 HIVE-8435.08.patch, HIVE-8435.09.patch, HIVE-8435.1.patch, 
 HIVE-8435.10.patch, HIVE-8435.patch


 In some cases there is an identity project in the plan which is useless. Better 
 to optimize it away to avoid evaluating it at runtime without any benefit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8435) Add identity project remover optimization

2014-11-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8435:
--
Attachment: (was: HIVE-8435.10.patch)

 Add identity project remover optimization
 -

 Key: HIVE-8435
 URL: https://issues.apache.org/jira/browse/HIVE-8435
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Jesús Camacho Rodríguez
  Labels: TODOC15
 Fix For: 0.15.0

 Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, 
 HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, 
 HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, 
 HIVE-8435.08.patch, HIVE-8435.09.patch, HIVE-8435.1.patch, 
 HIVE-8435.10.patch, HIVE-8435.patch


 In some cases there is an identity project in the plan which is useless. Better 
 to optimize it away to avoid evaluating it at runtime without any benefit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8435) Add identity project remover optimization

2014-11-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8435:
--
Attachment: HIVE-8435.10.patch

I added an additional check to detect whether the SelectOp, although it keeps 
all the input columns, swaps their order; in that case, it should not be 
removed from the plan.

Let's see how the tests go.

 Add identity project remover optimization
 -

 Key: HIVE-8435
 URL: https://issues.apache.org/jira/browse/HIVE-8435
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Jesús Camacho Rodríguez
  Labels: TODOC15
 Fix For: 0.15.0

 Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, 
 HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, 
 HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, 
 HIVE-8435.08.patch, HIVE-8435.09.patch, HIVE-8435.1.patch, 
 HIVE-8435.10.patch, HIVE-8435.patch


 In some cases there is an identity project in the plan which is useless. Better 
 to optimize it away to avoid evaluating it at runtime without any benefit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8435) Add identity project remover optimization

2014-11-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8435:
--
Attachment: HIVE-8435.10.patch

 Add identity project remover optimization
 -

 Key: HIVE-8435
 URL: https://issues.apache.org/jira/browse/HIVE-8435
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Jesús Camacho Rodríguez
  Labels: TODOC15
 Fix For: 0.15.0

 Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, 
 HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, 
 HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, 
 HIVE-8435.08.patch, HIVE-8435.09.patch, HIVE-8435.1.patch, 
 HIVE-8435.10.patch, HIVE-8435.patch


 In some cases there is an identity project in the plan which is useless. Better 
 to optimize it away to avoid evaluating it at runtime without any benefit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-8435) Add identity project remover optimization

2014-11-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez resolved HIVE-8435.
---
Resolution: Fixed

 Add identity project remover optimization
 -

 Key: HIVE-8435
 URL: https://issues.apache.org/jira/browse/HIVE-8435
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Jesús Camacho Rodríguez
  Labels: TODOC15
 Fix For: 0.15.0

 Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, 
 HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, 
 HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, 
 HIVE-8435.08.patch, HIVE-8435.09.patch, HIVE-8435.1.patch, 
 HIVE-8435.10.patch, HIVE-8435.patch


 In some cases there is an identity project in the plan which is useless. Better 
 to optimize it away to avoid evaluating it at runtime without any benefit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8435) Add identity project remover optimization

2014-11-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8435:
--
Attachment: (was: HIVE-8435.10.patch)

 Add identity project remover optimization
 -

 Key: HIVE-8435
 URL: https://issues.apache.org/jira/browse/HIVE-8435
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Jesús Camacho Rodríguez
  Labels: TODOC15
 Fix For: 0.15.0

 Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, 
 HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, 
 HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, 
 HIVE-8435.08.patch, HIVE-8435.09.patch, HIVE-8435.1.patch, HIVE-8435.patch


 In some cases there is an identity project in the plan which is useless. Better 
 to optimize it away to avoid evaluating it at runtime without any benefit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8926) Projections that only swap input columns are identified incorrectly as identity projections

2014-11-20 Thread JIRA
Jesús Camacho Rodríguez created HIVE-8926:
-

 Summary: Projections that only swap input columns are identified 
incorrectly as identity projections
 Key: HIVE-8926
 URL: https://issues.apache.org/jira/browse/HIVE-8926
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.15.0
Reporter: Jesús Camacho Rodríguez
Assignee: Jesús Camacho Rodríguez
 Fix For: 0.15.0


Projection operators that only swap the input columns in the tuples are 
identified as identity projections, and thus incorrectly deleted from the plan 
by the _identity project removal_ optimization.

This bug was introduced in HIVE-8435.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HIVE-8926) Projections that only swap input columns are identified incorrectly as identity projections

2014-11-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-8926 started by Jesús Camacho Rodríguez.
-
 Projections that only swap input columns are identified incorrectly as 
 identity projections
 ---

 Key: HIVE-8926
 URL: https://issues.apache.org/jira/browse/HIVE-8926
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.15.0
Reporter: Jesús Camacho Rodríguez
Assignee: Jesús Camacho Rodríguez
 Fix For: 0.15.0


 Projection operators that only swap the input columns in the tuples are 
 identified as identity projections, and thus incorrectly deleted from the 
 plan by the _identity project removal_ optimization.
 This bug was introduced in HIVE-8435.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8926) Projections that only swap input columns are identified incorrectly as identity projections

2014-11-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8926:
--
Attachment: HIVE-8926.patch

[~ashutoshc], I added an additional check to detect whether the SelectOp, 
although it keeps all the input columns, swaps their order; in that case, it 
should not be removed from the plan.

Let's see how the tests go.
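The check described in the comment above can be sketched in a few lines (this
is an illustration of the rule, not the actual Hive SelectOperator code; the
function name is invented): a projection counts as an identity only if it keeps
every input column in the same order, so a column swap is never identity.

```python
def is_identity_projection(input_cols, projected_cols):
    """Identity only if all input columns are kept in the same order;
    a swap like ["b", "a", "c"] over inputs ["a", "b", "c"] is not identity."""
    return list(projected_cols) == list(input_cols)
```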

 Projections that only swap input columns are identified incorrectly as 
 identity projections
 ---

 Key: HIVE-8926
 URL: https://issues.apache.org/jira/browse/HIVE-8926
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.15.0
Reporter: Jesús Camacho Rodríguez
Assignee: Jesús Camacho Rodríguez
 Fix For: 0.15.0

 Attachments: HIVE-8926.patch


 Projection operators that only swap the input columns in the tuples are 
 identified as identity projections, and thus incorrectly deleted from the 
 plan by the _identity project removal_ optimization.
 This bug was introduced in HIVE-8435.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8926) Projections that only swap input columns are identified incorrectly as identity projections

2014-11-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8926:
--
Status: Patch Available  (was: In Progress)

 Projections that only swap input columns are identified incorrectly as 
 identity projections
 ---

 Key: HIVE-8926
 URL: https://issues.apache.org/jira/browse/HIVE-8926
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.15.0
Reporter: Jesús Camacho Rodríguez
Assignee: Jesús Camacho Rodríguez
 Fix For: 0.15.0

 Attachments: HIVE-8926.patch


 Projection operators that only swap the input columns in the tuples are 
 identified as identity projections, and thus incorrectly deleted from the 
 plan by the _identity project removal_ optimization.
 This bug was introduced in HIVE-8435.





[jira] [Assigned] (HIVE-8870) errors when selecting a struct field within an array from ORC based tables

2014-11-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña reassigned HIVE-8870:
-

Assignee: Sergio Peña

 errors when selecting a struct field within an array from ORC based tables
 --

 Key: HIVE-8870
 URL: https://issues.apache.org/jira/browse/HIVE-8870
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Query Processor
Affects Versions: 0.13.0, 0.14.0
 Environment: HDP 2.1 / HDP 2.2 (YARN, but no Tez)
Reporter: Michael Haeusler
Assignee: Sergio Peña

 When using ORC as storage for a table, we get errors on selecting a struct 
 field within an array. These errors do not appear with default format.
 {code:sql}
 CREATE  TABLE `foobar_orc`(
   `uid` bigint,
   `elements` array<struct<elementid:bigint,foo:struct<bar:string>>>)
 STORED AS ORC;
 {code}
 When selecting from this _empty_ table, we get a direct NPE within the Hive 
 CLI:
 {code:sql}
 SELECT
   elements.elementId
 FROM
   foobar_orc;
 -- FAILED: RuntimeException java.lang.NullPointerException
 {code}
 A more real-world query produces a RuntimeException / NullPointerException in 
 the mapper:
 {code:sql}
 SELECT
   uid,
   element.elementId
 FROM
   foobar_orc
 LATERAL VIEW
   EXPLODE(elements) e AS element;
 Error: java.lang.RuntimeException: Error in configuring object
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
 [...]
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.ExprNodeFieldEvaluator.initialize(ExprNodeFieldEvaluator.java:61)
 [...]
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask
 {code}
 Both queries run fine on a non-orc table:
 {code:sql}
 CREATE  TABLE `foobar`(
   `uid` bigint,
   `elements` array<struct<elementid:bigint,foo:struct<bar:string>>>);
 SELECT
   elements.elementId
 FROM
   foobar;
 -- OK
 -- Time taken: 0.225 seconds
 SELECT
   uid,
   element.elementId
 FROM
   foobar
 LATERAL VIEW
   EXPLODE(elements) e AS element;
 -- Total MapReduce CPU Time Spent: 1 seconds 920 msec
 -- OK
 -- Time taken: 25.905 seconds
 {code}





[jira] [Commented] (HIVE-8870) errors when selecting a struct field within an array from ORC based tables

2014-11-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219815#comment-14219815
 ] 

Sergio Peña commented on HIVE-8870:
---

I found that the issue is caused by ORC being case sensitive.

SELECT element.elementId FROM foobar_orc;  (fails)
SELECT element.elementid FROM foobar_orc;  (succeeds)

I'll fix this by lowercasing the column names in the query.
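The idea behind the fix can be sketched as follows (a hypothetical illustration; `resolve_field` is not the actual Hive method): resolve struct field names case-insensitively by lowercasing both sides before the lookup.

```python
# Hypothetical sketch of case-insensitive struct-field resolution,
# analogous to the fix described above (not the actual Hive code).
def resolve_field(declared_fields, queried_name):
    """Return the canonical declared field name, matching case-insensitively."""
    by_lower = {name.lower(): name for name in declared_fields}
    return by_lower.get(queried_name.lower())

# 'elementId' in the query matches the declared 'elementid' column.
print(resolve_field(["elementid", "foo"], "elementId"))  # -> elementid
```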

 errors when selecting a struct field within an array from ORC based tables
 --

 Key: HIVE-8870
 URL: https://issues.apache.org/jira/browse/HIVE-8870
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Query Processor
Affects Versions: 0.13.0, 0.14.0
 Environment: HDP 2.1 / HDP 2.2 (YARN, but no Tez)
Reporter: Michael Haeusler
Assignee: Sergio Peña

 When using ORC as storage for a table, we get errors on selecting a struct 
 field within an array. These errors do not appear with default format.
 {code:sql}
 CREATE  TABLE `foobar_orc`(
   `uid` bigint,
   `elements` array<struct<elementid:bigint,foo:struct<bar:string>>>)
 STORED AS ORC;
 {code}
 When selecting from this _empty_ table, we get a direct NPE within the Hive 
 CLI:
 {code:sql}
 SELECT
   elements.elementId
 FROM
   foobar_orc;
 -- FAILED: RuntimeException java.lang.NullPointerException
 {code}
 A more real-world query produces a RuntimeException / NullPointerException in 
 the mapper:
 {code:sql}
 SELECT
   uid,
   element.elementId
 FROM
   foobar_orc
 LATERAL VIEW
   EXPLODE(elements) e AS element;
 Error: java.lang.RuntimeException: Error in configuring object
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
 [...]
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.ExprNodeFieldEvaluator.initialize(ExprNodeFieldEvaluator.java:61)
 [...]
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask
 {code}
 Both queries run fine on a non-orc table:
 {code:sql}
 CREATE  TABLE `foobar`(
   `uid` bigint,
   `elements` array<struct<elementid:bigint,foo:struct<bar:string>>>);
 SELECT
   elements.elementId
 FROM
   foobar;
 -- OK
 -- Time taken: 0.225 seconds
 SELECT
   uid,
   element.elementId
 FROM
   foobar
 LATERAL VIEW
   EXPLODE(elements) e AS element;
 -- Total MapReduce CPU Time Spent: 1 seconds 920 msec
 -- OK
 -- Time taken: 25.905 seconds
 {code}





[jira] [Updated] (HIVE-8870) errors when selecting a struct field within an array from ORC based tables

2014-11-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8870:
--
Attachment: HIVE-8870.1.patch

This patch converts the query columns to lower case in order to search for the 
correct struct column.

 errors when selecting a struct field within an array from ORC based tables
 --

 Key: HIVE-8870
 URL: https://issues.apache.org/jira/browse/HIVE-8870
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Query Processor
Affects Versions: 0.13.0, 0.14.0
 Environment: HDP 2.1 / HDP 2.2 (YARN, but no Tez)
Reporter: Michael Haeusler
Assignee: Sergio Peña
 Attachments: HIVE-8870.1.patch


 When using ORC as storage for a table, we get errors on selecting a struct 
 field within an array. These errors do not appear with default format.
 {code:sql}
 CREATE  TABLE `foobar_orc`(
   `uid` bigint,
   `elements` array<struct<elementid:bigint,foo:struct<bar:string>>>)
 STORED AS ORC;
 {code}
 When selecting from this _empty_ table, we get a direct NPE within the Hive 
 CLI:
 {code:sql}
 SELECT
   elements.elementId
 FROM
   foobar_orc;
 -- FAILED: RuntimeException java.lang.NullPointerException
 {code}
 A more real-world query produces a RuntimeException / NullPointerException in 
 the mapper:
 {code:sql}
 SELECT
   uid,
   element.elementId
 FROM
   foobar_orc
 LATERAL VIEW
   EXPLODE(elements) e AS element;
 Error: java.lang.RuntimeException: Error in configuring object
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
 [...]
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.ExprNodeFieldEvaluator.initialize(ExprNodeFieldEvaluator.java:61)
 [...]
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask
 {code}
 Both queries run fine on a non-orc table:
 {code:sql}
 CREATE  TABLE `foobar`(
   `uid` bigint,
   `elements` array<struct<elementid:bigint,foo:struct<bar:string>>>);
 SELECT
   elements.elementId
 FROM
   foobar;
 -- OK
 -- Time taken: 0.225 seconds
 SELECT
   uid,
   element.elementId
 FROM
   foobar
 LATERAL VIEW
   EXPLODE(elements) e AS element;
 -- Total MapReduce CPU Time Spent: 1 seconds 920 msec
 -- OK
 -- Time taken: 25.905 seconds
 {code}





[jira] [Updated] (HIVE-8870) errors when selecting a struct field within an array from ORC based tables

2014-11-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8870:
--
Status: Patch Available  (was: Open)

 errors when selecting a struct field within an array from ORC based tables
 --

 Key: HIVE-8870
 URL: https://issues.apache.org/jira/browse/HIVE-8870
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Query Processor
Affects Versions: 0.13.0, 0.14.0
 Environment: HDP 2.1 / HDP 2.2 (YARN, but no Tez)
Reporter: Michael Haeusler
Assignee: Sergio Peña
 Attachments: HIVE-8870.1.patch


 When using ORC as storage for a table, we get errors on selecting a struct 
 field within an array. These errors do not appear with default format.
 {code:sql}
 CREATE  TABLE `foobar_orc`(
   `uid` bigint,
   `elements` array<struct<elementid:bigint,foo:struct<bar:string>>>)
 STORED AS ORC;
 {code}
 When selecting from this _empty_ table, we get a direct NPE within the Hive 
 CLI:
 {code:sql}
 SELECT
   elements.elementId
 FROM
   foobar_orc;
 -- FAILED: RuntimeException java.lang.NullPointerException
 {code}
 A more real-world query produces a RuntimeException / NullPointerException in 
 the mapper:
 {code:sql}
 SELECT
   uid,
   element.elementId
 FROM
   foobar_orc
 LATERAL VIEW
   EXPLODE(elements) e AS element;
 Error: java.lang.RuntimeException: Error in configuring object
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
 [...]
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.ExprNodeFieldEvaluator.initialize(ExprNodeFieldEvaluator.java:61)
 [...]
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask
 {code}
 Both queries run fine on a non-orc table:
 {code:sql}
 CREATE  TABLE `foobar`(
   `uid` bigint,
   `elements` array<struct<elementid:bigint,foo:struct<bar:string>>>);
 SELECT
   elements.elementId
 FROM
   foobar;
 -- OK
 -- Time taken: 0.225 seconds
 SELECT
   uid,
   element.elementId
 FROM
   foobar
 LATERAL VIEW
   EXPLODE(elements) e AS element;
 -- Total MapReduce CPU Time Spent: 1 seconds 920 msec
 -- OK
 -- Time taken: 25.905 seconds
 {code}





[jira] [Commented] (HIVE-8909) Hive doesn't correctly read Parquet nested types

2014-11-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14220018#comment-14220018
 ] 

Sergio Peña commented on HIVE-8909:
---

[~rdblue]

There are a few Parquet query tests in the following paths:
ql/src/test/queries/clientpositive/parquet_*.q
ql/src/test/results/clientpositive/parquet_*.q.out

The data files used by those query tests are here (just read the *.q 
file to see which one is used):
data/files/*

 Hive doesn't correctly read Parquet nested types
 

 Key: HIVE-8909
 URL: https://issues.apache.org/jira/browse/HIVE-8909
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Ryan Blue
Assignee: Ryan Blue
 Attachments: HIVE-8909-1.patch, HIVE-8909-2.patch, HIVE-8909.2.patch


 Parquet's Avro and Thrift object models don't produce the same parquet type 
 representation for lists and maps that Hive does. In the Parquet community, 
 we've defined what should be written and backward-compatibility rules for 
 existing data written by parquet-avro and parquet-thrift in PARQUET-113. We 
 need to implement those rules in the Hive Converter classes.
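Roughly, the backward-compatibility rules decide, for a repeated field inside a LIST-annotated group, whether that repeated field is itself the list element (legacy two-level lists written by parquet-avro/parquet-thrift) or a wrapper around it (standard three-level lists). A simplified sketch, with a hypothetical tuple encoding of the schema:

```python
def list_element_is_repeated_type(repeated_field):
    """Simplified sketch of the PARQUET-113 backward-compat rule: decide
    whether the repeated field inside a LIST group is itself the element
    (legacy 2-level lists) or a wrapper group around it (3-level lists).

    repeated_field is a hypothetical (name, is_group, field_count) tuple.
    """
    name, is_group, field_count = repeated_field
    if not is_group:
        return True      # repeated primitive: it IS the element
    if field_count > 1:
        return True      # struct element written without a wrapper group
    if name == "array" or name.endswith("_tuple"):
        return True      # legacy parquet-avro / parquet-thrift naming
    return False         # standard 3-level list: wrapper around the element

print(list_element_is_repeated_type(("element", True, 1)))  # False (3-level)
print(list_element_is_repeated_type(("array", True, 1)))    # True (legacy avro)
```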





[jira] [Updated] (HIVE-8926) Projections that only swap input columns are identified incorrectly as identity projections

2014-11-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8926:
--
Attachment: HIVE-8926.01.patch

Now SelStarNoCompute is checked in the configuration object, as the 
SelStarNoCompute field in the operator is not initialized until the initializeOp 
method is called. I also added an additional check on the length of colList.
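The kind of check being described can be sketched as follows (hypothetical names, not the actual patch): a projection is an identity only if it keeps every input column in its original position, so both a dropped column and a swap disqualify it.

```python
def is_identity_projection(col_list, input_cols):
    """A projection is an identity only if it selects every input
    column, in the original order, with nothing added or dropped."""
    if len(col_list) != len(input_cols):
        return False  # the extra length check mentioned above
    return all(c == i for c, i in zip(col_list, input_cols))

cols = ["a", "b", "c"]
print(is_identity_projection(["a", "b", "c"], cols))  # True
print(is_identity_projection(["b", "a", "c"], cols))  # False: columns swapped
print(is_identity_projection(["a", "b"], cols))       # False: column dropped
```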

 Projections that only swap input columns are identified incorrectly as 
 identity projections
 ---

 Key: HIVE-8926
 URL: https://issues.apache.org/jira/browse/HIVE-8926
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.15.0
Reporter: Jesús Camacho Rodríguez
Assignee: Jesús Camacho Rodríguez
 Fix For: 0.15.0

 Attachments: HIVE-8926.01.patch, HIVE-8926.patch


 Projection operators that only swap the input columns in the tuples are 
 identified as identity projections, and thus incorrectly deleted from the 
 plan by the _identity project removal_ optimization.
 This bug was introduced in HIVE-8435.





[jira] [Commented] (HIVE-8909) Hive doesn't correctly read Parquet nested types

2014-11-21 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221230#comment-14221230
 ] 

Sergio Peña commented on HIVE-8909:
---

Hey [~rdblue]

Could you post the patch in the review board?

 Hive doesn't correctly read Parquet nested types
 

 Key: HIVE-8909
 URL: https://issues.apache.org/jira/browse/HIVE-8909
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Ryan Blue
Assignee: Ryan Blue
 Attachments: HIVE-8909-1.patch, HIVE-8909-2.patch, HIVE-8909.2.patch, 
 HIVE-8909.3.patch, parquet-test-data.tar.gz


 Parquet's Avro and Thrift object models don't produce the same parquet type 
 representation for lists and maps that Hive does. In the Parquet community, 
 we've defined what should be written and backward-compatibility rules for 
 existing data written by parquet-avro and parquet-thrift in PARQUET-113. We 
 need to implement those rules in the Hive Converter classes.





[jira] [Commented] (HIVE-8435) Add identity project remover optimization

2014-11-21 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221276#comment-14221276
 ] 

Jesús Camacho Rodríguez commented on HIVE-8435:
---

[~sershe], sure, I'll help with HIVE-8395, it's not a problem; one way or 
another we knew there was going to be a lot of reviewing work. Just let me know 
where to start.

By the way, HIVE-8926 will introduce a few more changes in test results (~10 
tests).

 Add identity project remover optimization
 -

 Key: HIVE-8435
 URL: https://issues.apache.org/jira/browse/HIVE-8435
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Jesús Camacho Rodríguez
  Labels: TODOC15
 Fix For: 0.15.0

 Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, 
 HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, 
 HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, 
 HIVE-8435.08.patch, HIVE-8435.09.patch, HIVE-8435.1.patch, HIVE-8435.patch


 In some cases there is an identity project in plan which is useless. Better 
 to optimize it away to avoid evaluating it without any benefit at runtime.
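As a rough illustration of the optimization (a hypothetical plan structure, not Hive's actual operator classes), removing an identity projection amounts to splicing the operator out of the parent/child chain:

```python
class Op:
    """Hypothetical plan operator with a single child link."""
    def __init__(self, name, identity=False):
        self.name, self.identity = name, identity
        self.child = None

def remove_identity_projects(root):
    """Splice out any operator flagged as a useless identity projection."""
    node = root
    while node and node.child:
        if node.child.identity:
            node.child = node.child.child  # bypass the identity project
        else:
            node = node.child
    return root

scan, proj, sink = Op("TableScan"), Op("Select", identity=True), Op("FileSink")
scan.child, proj.child = proj, sink
remove_identity_projects(scan)
print(scan.child.name)  # FileSink: the identity Select was removed
```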





[jira] [Commented] (HIVE-8909) Hive doesn't correctly read Parquet nested types

2014-11-21 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221386#comment-14221386
 ] 

Sergio Peña commented on HIVE-8909:
---

Looks good [~rdblue]. I ran the tests locally and they pass as well.
+1

 Hive doesn't correctly read Parquet nested types
 

 Key: HIVE-8909
 URL: https://issues.apache.org/jira/browse/HIVE-8909
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Ryan Blue
Assignee: Ryan Blue
 Attachments: HIVE-8909-1.patch, HIVE-8909-2.patch, HIVE-8909.2.patch, 
 HIVE-8909.3.patch, HIVE-8909.4.patch, parquet-test-data.tar.gz


 Parquet's Avro and Thrift object models don't produce the same parquet type 
 representation for lists and maps that Hive does. In the Parquet community, 
 we've defined what should be written and backward-compatibility rules for 
 existing data written by parquet-avro and parquet-thrift in PARQUET-113. We 
 need to implement those rules in the Hive Converter classes.





[jira] [Created] (HIVE-8945) Allow user to read encrypted read-only tables only if the scratch directory is encrypted

2014-11-21 Thread JIRA
Sergio Peña created HIVE-8945:
-

 Summary: Allow user to read encrypted read-only tables only if the 
scratch directory is encrypted
 Key: HIVE-8945
 URL: https://issues.apache.org/jira/browse/HIVE-8945
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergio Peña
Assignee: Sergio Peña


With the changes for HDFS encryption, Hive creates a staging directory inside 
table locations. If a user accesses a table with read-only access, then the 
staging directory is created in the old scratch directory 
(hive.exec.scratchdir). 

For security reasons, this does not work if the accessed table is encrypted: we 
don't want encrypted data written to an unencrypted zone. But what if the 
scratch directory is itself encrypted? Then we should allow the access.

This bug is to fix that.
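The intended behavior can be sketched as a small decision function (hypothetical names and return values; the real logic lives in Hive's staging-path selection code):

```python
def staging_location(table_encrypted, table_writable, scratch_encrypted):
    """Decide where the .hive-staging directory may go, per the rule above."""
    if table_writable:
        return "table_location"   # staging goes inside the table directory
    if not table_encrypted:
        return "scratch_dir"      # plain data may use the old scratch dir
    # read-only AND encrypted: only an encrypted scratch dir is acceptable
    return "scratch_dir" if scratch_encrypted else "DENY"

print(staging_location(True, False, True))   # scratch_dir
print(staging_location(True, False, False))  # DENY
```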





[jira] [Commented] (HIVE-8909) Hive doesn't correctly read Parquet nested types

2014-11-21 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221428#comment-14221428
 ] 

Sergio Peña commented on HIVE-8909:
---

They didn't

{noformat}
mvn test -Phadoop-2 -Dtest=TestCliDriver 
-Dqfile=parquet_array_of_optional_elements.q,parquet_array_of_required_elements.q,parquet_array_of_single_field_struct.q,parquet_array_of_structs.q,parquet_array_of_unannotated_groups.q,parquet_array_of_unannotated_primitives.q,parquet_avro_array_of_primitives.q,parquet_avro_array_of_single_field_struct.q,parquet_nested_complex.q,parquet_thrift_array_of_primitives.q,parquet_thrift_array_of_single_field_struct.q

---
 T E S T S
---
Running org.apache.hadoop.hive.cli.TestCliDriver
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 80.949 sec - 
in org.apache.hadoop.hive.cli.TestCliDriver

Results :

Tests run: 12, Failures: 0, Errors: 0, Skipped: 0
{noformat}

I ran the first two tests 
(parquet_array_null_element, parquet_array_of_multi_field_struct) manually 
before running the 12 tests, and they passed.

 Hive doesn't correctly read Parquet nested types
 

 Key: HIVE-8909
 URL: https://issues.apache.org/jira/browse/HIVE-8909
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Ryan Blue
Assignee: Ryan Blue
 Attachments: HIVE-8909-1.patch, HIVE-8909-2.patch, HIVE-8909.2.patch, 
 HIVE-8909.3.patch, HIVE-8909.4.patch, parquet-test-data.tar.gz


 Parquet's Avro and Thrift object models don't produce the same parquet type 
 representation for lists and maps that Hive does. In the Parquet community, 
 we've defined what should be written and backward-compatibility rules for 
 existing data written by parquet-avro and parquet-thrift in PARQUET-113. We 
 need to implement those rules in the Hive Converter classes.





[jira] [Updated] (HIVE-8945) Allow user to read encrypted read-only tables only if the scratch directory is encrypted

2014-11-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8945:
--
Status: Patch Available  (was: Open)

 Allow user to read encrypted read-only tables only if the scratch directory 
 is encrypted
 

 Key: HIVE-8945
 URL: https://issues.apache.org/jira/browse/HIVE-8945
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-8945.1.patch


 With the changes for HDFS encryption, Hive creates a staging directory inside 
 table locations. If a user accesses a table with read-only access, then the 
 staging directory is created in the old scratch directory 
 (hive.exec.scratchdir). 
 For security reasons, this does not work if the accessed table is encrypted: 
 we don't want encrypted data written to an unencrypted zone. But what if the 
 scratch directory is itself encrypted? Then we should allow the access.
 This bug is to fix that.





[jira] [Updated] (HIVE-8945) Allow user to read encrypted read-only tables only if the scratch directory is encrypted

2014-11-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8945:
--
Attachment: HIVE-8945.1.patch

 Allow user to read encrypted read-only tables only if the scratch directory 
 is encrypted
 

 Key: HIVE-8945
 URL: https://issues.apache.org/jira/browse/HIVE-8945
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-8945.1.patch


 With the changes for HDFS encryption, Hive creates a staging directory inside 
 table locations. If a user accesses a table with read-only access, then the 
 staging directory is created in the old scratch directory 
 (hive.exec.scratchdir). 
 For security reasons, this does not work if the accessed table is encrypted: 
 we don't want encrypted data written to an unencrypted zone. But what if the 
 scratch directory is itself encrypted? Then we should allow the access.
 This bug is to fix that.





[jira] [Commented] (HIVE-8909) Hive doesn't correctly read Parquet nested types

2014-11-21 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221540#comment-14221540
 ] 

Sergio Peña commented on HIVE-8909:
---

+1 again :)

 Hive doesn't correctly read Parquet nested types
 

 Key: HIVE-8909
 URL: https://issues.apache.org/jira/browse/HIVE-8909
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Ryan Blue
Assignee: Ryan Blue
 Attachments: HIVE-8909-1.patch, HIVE-8909-2.patch, HIVE-8909.2.patch, 
 HIVE-8909.3.patch, HIVE-8909.4.patch, HIVE-8909.5.patch, HIVE-8909.6.patch, 
 parquet-test-data.tar.gz


 Parquet's Avro and Thrift object models don't produce the same parquet type 
 representation for lists and maps that Hive does. In the Parquet community, 
 we've defined what should be written and backward-compatibility rules for 
 existing data written by parquet-avro and parquet-thrift in PARQUET-113. We 
 need to implement those rules in the Hive Converter classes.





[jira] [Updated] (HIVE-8926) Projections that only swap input columns are identified incorrectly as identity projections

2014-11-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8926:
--
Attachment: HIVE-8926.01.patch

Re-uploading for tests...

 Projections that only swap input columns are identified incorrectly as 
 identity projections
 ---

 Key: HIVE-8926
 URL: https://issues.apache.org/jira/browse/HIVE-8926
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.15.0
Reporter: Jesús Camacho Rodríguez
Assignee: Jesús Camacho Rodríguez
 Fix For: 0.15.0

 Attachments: HIVE-8926.01.patch, HIVE-8926.01.patch, HIVE-8926.patch


 Projection operators that only swap the input columns in the tuples are 
 identified as identity projections, and thus incorrectly deleted from the 
 plan by the _identity project removal_ optimization.
 This bug was introduced in HIVE-8435.





[jira] [Updated] (HIVE-8926) Projections that only swap input columns are identified incorrectly as identity projections

2014-11-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8926:
--
Attachment: (was: HIVE-8926.01.patch)

 Projections that only swap input columns are identified incorrectly as 
 identity projections
 ---

 Key: HIVE-8926
 URL: https://issues.apache.org/jira/browse/HIVE-8926
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.15.0
Reporter: Jesús Camacho Rodríguez
Assignee: Jesús Camacho Rodríguez
 Fix For: 0.15.0

 Attachments: HIVE-8926.01.patch, HIVE-8926.patch


 Projection operators that only swap the input columns in the tuples are 
 identified as identity projections, and thus incorrectly deleted from the 
 plan by the _identity project removal_ optimization.
 This bug was introduced in HIVE-8435.





[jira] [Updated] (HIVE-8926) Projections that only swap input columns are identified incorrectly as identity projections

2014-11-22 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8926:
--
Status: In Progress  (was: Patch Available)

 Projections that only swap input columns are identified incorrectly as 
 identity projections
 ---

 Key: HIVE-8926
 URL: https://issues.apache.org/jira/browse/HIVE-8926
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.15.0
Reporter: Jesús Camacho Rodríguez
Assignee: Jesús Camacho Rodríguez
 Fix For: 0.15.0

 Attachments: HIVE-8926.01.patch, HIVE-8926.patch


 Projection operators that only swap the input columns in the tuples are 
 identified as identity projections, and thus incorrectly deleted from the 
 plan by the _identity project removal_ optimization.
 This bug was introduced in HIVE-8435.





[jira] [Updated] (HIVE-8926) Projections that only swap input columns are identified incorrectly as identity projections

2014-11-22 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8926:
--
Status: Patch Available  (was: In Progress)

 Projections that only swap input columns are identified incorrectly as 
 identity projections
 ---

 Key: HIVE-8926
 URL: https://issues.apache.org/jira/browse/HIVE-8926
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.15.0
Reporter: Jesús Camacho Rodríguez
Assignee: Jesús Camacho Rodríguez
 Fix For: 0.15.0

 Attachments: HIVE-8926.01.patch, HIVE-8926.patch


 Projection operators that only swap the input columns in the tuples are 
 identified as identity projections, and thus incorrectly deleted from the 
 plan by the _identity project removal_ optimization.
 This bug was introduced in HIVE-8435.





[jira] [Commented] (HIVE-8857) hive release has SNAPSHOT dependency, which is not on maven central

2014-11-24 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14222926#comment-14222926
 ] 

André Kelpe commented on HIVE-8857:
---

There are two things here: having a SNAPSHOT dependency in a release is a 
problem to begin with, but having a Hive release with broken dependencies on 
Maven Central is even more problematic. How am I, as a user, supposed to know 
that I need to add a random snapshot repository to get Hive working?
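The first problem is mechanically checkable: a release sanity check could scan the published dependency coordinates for SNAPSHOT versions. A minimal sketch (hypothetical helper, not an actual Hive or Maven tool):

```python
def snapshot_deps(dependencies):
    """Return every group:artifact:version coordinate pinned to a SNAPSHOT."""
    return [d for d in dependencies
            if d.rsplit(":", 1)[-1].endswith("-SNAPSHOT")]

deps = [
    "org.apache.calcite:calcite-core:0.9.2-incubating-SNAPSHOT",
    "org.slf4j:slf4j-api:1.7.6",
]
print(snapshot_deps(deps))  # only the calcite coordinate is flagged
```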

 hive release has SNAPSHOT dependency, which is not on maven central
 ---

 Key: HIVE-8857
 URL: https://issues.apache.org/jira/browse/HIVE-8857
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: André Kelpe

 I just tried building a project that uses hive-exec as a dependency, and it 
 bails out since Hive 0.14.0 introduced a SNAPSHOT dependency on Apache 
 Calcite, which is not on Maven Central. Do we have to include another 
 repository now? Besides that, it also seems problematic to rely on a SNAPSHOT 
 dependency, which can change at any time.
 {code}
 :compileJava
 Download 
 http://repo1.maven.org/maven2/org/apache/hive/hive-exec/0.14.0/hive-exec-0.14.0.pom
 Download 
 http://repo1.maven.org/maven2/org/apache/hive/hive/0.14.0/hive-0.14.0.pom
 Download 
 http://repo1.maven.org/maven2/org/apache/hive/hive-ant/0.14.0/hive-ant-0.14.0.pom
 Download 
 http://repo1.maven.org/maven2/org/apache/hive/hive-metastore/0.14.0/hive-metastore-0.14.0.pom
 Download 
 http://repo1.maven.org/maven2/org/apache/hive/hive-shims/0.14.0/hive-shims-0.14.0.pom
 Download 
 http://repo1.maven.org/maven2/org/fusesource/jansi/jansi/1.11/jansi-1.11.pom
 Download 
 http://repo1.maven.org/maven2/org/fusesource/jansi/jansi-project/1.11/jansi-project-1.11.pom
 Download 
 http://repo1.maven.org/maven2/org/fusesource/fusesource-pom/1.8/fusesource-pom-1.8.pom
 Download 
 http://repo1.maven.org/maven2/org/apache/hive/hive-serde/0.14.0/hive-serde-0.14.0.pom
 Download 
 http://repo1.maven.org/maven2/org/apache/hive/shims/hive-shims-common/0.14.0/hive-shims-common-0.14.0.pom
 Download 
 http://repo1.maven.org/maven2/org/apache/hive/shims/hive-shims-common-secure/0.14.0/hive-shims-common-secure-0.14.0.pom
 Download 
 http://repo1.maven.org/maven2/org/apache/hive/shims/hive-shims-0.20/0.14.0/hive-shims-0.20-0.14.0.pom
 Download 
 http://repo1.maven.org/maven2/org/apache/hive/shims/hive-shims-0.20S/0.14.0/hive-shims-0.20S-0.14.0.pom
 Download 
 http://repo1.maven.org/maven2/org/apache/hive/shims/hive-shims-0.23/0.14.0/hive-shims-0.23-0.14.0.pom
 Download 
 http://repo1.maven.org/maven2/org/apache/hive/hive-common/0.14.0/hive-common-0.14.0.pom
 Download 
 http://repo1.maven.org/maven2/org/apache/curator/curator-framework/2.6.0/curator-framework-2.6.0.pom
 Download 
 http://repo1.maven.org/maven2/org/apache/curator/apache-curator/2.6.0/apache-curator-2.6.0.pom
 Download 
 http://repo1.maven.org/maven2/org/apache/curator/curator-client/2.6.0/curator-client-2.6.0.pom
 Download 
 http://repo1.maven.org/maven2/org/slf4j/slf4j-api/1.7.6/slf4j-api-1.7.6.pom
 Download 
 http://repo1.maven.org/maven2/org/slf4j/slf4j-parent/1.7.6/slf4j-parent-1.7.6.pom
 FAILURE: Build failed with an exception.
 * What went wrong:
 Could not resolve all dependencies for configuration ':provided'.
  > Could not find org.apache.calcite:calcite-core:0.9.2-incubating-SNAPSHOT.
    Required by:
        cascading:cascading-hive:1.1.0-wip-dev > org.apache.hive:hive-exec:0.14.0
  > Could not find org.apache.calcite:calcite-avatica:0.9.2-incubating-SNAPSHOT.
    Required by:
        cascading:cascading-hive:1.1.0-wip-dev > org.apache.hive:hive-exec:0.14.0
 * Try:
 Run with --stacktrace option to get the stack trace. Run with --info or 
 --debug option to get more log output.
 BUILD FAILED
 Total time: 16.956 secs
 {code}





[jira] [Updated] (HIVE-8870) errors when selecting a struct field within an array from ORC based tables

2014-11-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8870:
--
Status: Open  (was: Patch Available)

 errors when selecting a struct field within an array from ORC based tables
 --

 Key: HIVE-8870
 URL: https://issues.apache.org/jira/browse/HIVE-8870
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Query Processor
Affects Versions: 0.13.0, 0.14.0
 Environment: HDP 2.1 / HDP 2.2 (YARN, but no Tez)
Reporter: Michael Haeusler
Assignee: Sergio Peña
 Attachments: HIVE-8870.1.patch


 When using ORC as storage for a table, we get errors when selecting a struct 
 field within an array. These errors do not appear with the default format.
 {code:sql}
 CREATE  TABLE `foobar_orc`(
   `uid` bigint,
   `elements` array<struct<elementid:bigint,foo:struct<bar:string>>>)
 STORED AS ORC;
 {code}
 When selecting from this _empty_ table, we get a direct NPE within the Hive 
 CLI:
 {code:sql}
 SELECT
   elements.elementId
 FROM
   foobar_orc;
 -- FAILED: RuntimeException java.lang.NullPointerException
 {code}
 A more real-world query produces a RuntimeException / NullPointerException in 
 the mapper:
 {code:sql}
 SELECT
   uid,
   element.elementId
 FROM
   foobar_orc
 LATERAL VIEW
   EXPLODE(elements) e AS element;
 Error: java.lang.RuntimeException: Error in configuring object
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
 [...]
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.ExprNodeFieldEvaluator.initialize(ExprNodeFieldEvaluator.java:61)
 [...]
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask
 {code}
 Both queries run fine on a non-ORC table:
 {code:sql}
 CREATE  TABLE `foobar`(
   `uid` bigint,
   `elements` array<struct<elementid:bigint,foo:struct<bar:string>>>);
 SELECT
   elements.elementId
 FROM
   foobar;
 -- OK
 -- Time taken: 0.225 seconds
 SELECT
   uid,
   element.elementId
 FROM
   foobar
 LATERAL VIEW
   EXPLODE(elements) e AS element;
 -- Total MapReduce CPU Time Spent: 1 seconds 920 msec
 -- OK
 -- Time taken: 25.905 seconds
 {code}
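To clarify what the failing projection computes, here is a minimal Python sketch (illustrative only, not Hive code; row values are made up): selecting a struct field through an array-of-structs column yields, per row, the array of that field's values, including an empty array for an empty column.

```python
# Hypothetical model of `SELECT elements.elementId`: projecting one struct
# field across every element of an array-of-structs column.
rows = [
    {"uid": 1, "elements": [{"elementid": 10, "foo": {"bar": "a"}},
                            {"elementid": 11, "foo": {"bar": "b"}}]},
    {"uid": 2, "elements": []},  # empty case, like the empty ORC table above
]

def project_field(row, field):
    """Project one struct field across an array-of-structs column."""
    return [element[field] for element in row["elements"]]

projected = [project_field(r, "elementid") for r in rows]
print(projected)  # [[10, 11], []]
```

The empty-table case should simply yield no (or empty) results, which is what the text-format table does and what the ORC reader fails to do here.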





[jira] [Updated] (HIVE-8870) errors when selecting a struct field within an array from ORC based tables

2014-11-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8870:
--
Status: Patch Available  (was: Open)

 errors when selecting a struct field within an array from ORC based tables
 --

 Key: HIVE-8870
 URL: https://issues.apache.org/jira/browse/HIVE-8870
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Query Processor
Affects Versions: 0.13.0, 0.14.0
 Environment: HDP 2.1 / HDP 2.2 (YARN, but no Tez)
Reporter: Michael Haeusler
Assignee: Sergio Peña
 Attachments: HIVE-8870.2.patch




[jira] [Updated] (HIVE-8870) errors when selecting a struct field within an array from ORC based tables

2014-11-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8870:
--
Attachment: (was: HIVE-8870.1.patch)

 errors when selecting a struct field within an array from ORC based tables
 --

 Key: HIVE-8870
 URL: https://issues.apache.org/jira/browse/HIVE-8870
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Query Processor
Affects Versions: 0.13.0, 0.14.0
 Environment: HDP 2.1 / HDP 2.2 (YARN, but no Tez)
Reporter: Michael Haeusler
Assignee: Sergio Peña
 Attachments: HIVE-8870.2.patch




[jira] [Updated] (HIVE-8870) errors when selecting a struct field within an array from ORC based tables

2014-11-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8870:
--
Attachment: HIVE-8870.2.patch

 errors when selecting a struct field within an array from ORC based tables
 --

 Key: HIVE-8870
 URL: https://issues.apache.org/jira/browse/HIVE-8870
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Query Processor
Affects Versions: 0.13.0, 0.14.0
 Environment: HDP 2.1 / HDP 2.2 (YARN, but no Tez)
Reporter: Michael Haeusler
Assignee: Sergio Peña
 Attachments: HIVE-8870.2.patch




[jira] [Updated] (HIVE-6914) parquet-hive cannot write nested map (map value is map)

2014-11-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-6914:
--
Attachment: HIVE-6914.4.patch

Here's the file with the .q.out updated

 parquet-hive cannot write nested map (map value is map)
 ---

 Key: HIVE-6914
 URL: https://issues.apache.org/jira/browse/HIVE-6914
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.12.0, 0.13.0
Reporter: Tongjie Chen
Assignee: Sergio Peña
  Labels: parquet, serialization
 Attachments: HIVE-6914.1.patch, HIVE-6914.1.patch, HIVE-6914.2.patch, 
 HIVE-6914.3.patch, HIVE-6914.4.patch, NestedMap.parquet


 // table schema (identical for both plain text version and parquet version)
 hive> desc text_mmap;
 m map<string,map<string,string>>
 // sample nested map entry
 {level1:{level2_key1:value1,level2_key2:value2}}
 The following query will fail:
 insert overwrite table parquet_mmap select * from text_mmap;
 Caused by: parquet.io.ParquetEncodingException: This should be an 
 ArrayWritable or MapWritable: 
 org.apache.hadoop.hive.ql.io.parquet.writable.BinaryWritable@f2f8106
 at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeData(DataWritableWriter.java:85)
 at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeArray(DataWritableWriter.java:118)
 at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeData(DataWritableWriter.java:80)
 at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeData(DataWritableWriter.java:82)
 at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.write(DataWritableWriter.java:55)
 at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriteSupport.write(DataWritableWriteSupport.java:59)
 at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriteSupport.write(DataWritableWriteSupport.java:31)
 at 
 parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:115)
 at parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:81)
 at parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:37)
 at 
 org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.write(ParquetRecordWriterWrapper.java:77)
 at 
 org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.write(ParquetRecordWriterWrapper.java:90)
 at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:622)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
 at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
 at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
 ... 9 more
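The stack trace suggests the writer's type dispatch stops recursing one level too early when a map value is itself a container. A hedged Python analogue (illustrative only, not the actual DataWritableWriter code) of the recursive dispatch a nested map requires:

```python
# Sketch of recursive writer dispatch: container values must be recursed
# into; treating a nested map value as a primitive raises the kind of
# "This should be an ArrayWritable or MapWritable" error quoted above.
def write_value(value):
    if isinstance(value, dict):          # map: recurse into each value
        return {k: write_value(v) for k, v in value.items()}
    if isinstance(value, list):          # array: recurse into each element
        return [write_value(v) for v in value]
    if isinstance(value, (str, int, float, bool)):
        return value                     # primitive leaf
    raise TypeError("unsupported value type: %r" % (value,))

nested = {"level1": {"level2_key1": "value1", "level2_key2": "value2"}}
print(write_value(nested) == nested)  # True: the nested map round-trips
```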





[jira] [Updated] (HIVE-6914) parquet-hive cannot write nested map (map value is map)

2014-11-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-6914:
--
Status: Open  (was: Patch Available)

 parquet-hive cannot write nested map (map value is map)
 ---

 Key: HIVE-6914
 URL: https://issues.apache.org/jira/browse/HIVE-6914
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.0, 0.12.0
Reporter: Tongjie Chen
Assignee: Sergio Peña
  Labels: parquet, serialization
 Attachments: HIVE-6914.1.patch, HIVE-6914.1.patch, HIVE-6914.2.patch, 
 HIVE-6914.3.patch, HIVE-6914.4.patch, NestedMap.parquet




[jira] [Updated] (HIVE-6914) parquet-hive cannot write nested map (map value is map)

2014-11-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-6914:
--
Status: Patch Available  (was: Open)

 parquet-hive cannot write nested map (map value is map)
 ---

 Key: HIVE-6914
 URL: https://issues.apache.org/jira/browse/HIVE-6914
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.0, 0.12.0
Reporter: Tongjie Chen
Assignee: Sergio Peña
  Labels: parquet, serialization
 Attachments: HIVE-6914.1.patch, HIVE-6914.1.patch, HIVE-6914.2.patch, 
 HIVE-6914.3.patch, HIVE-6914.4.patch, NestedMap.parquet




[jira] [Created] (HIVE-8960) ParsingException in the WHERE statement with a Sub Query

2014-11-25 Thread JIRA
Rémy SAISSY created HIVE-8960:
-

 Summary: ParsingException in the WHERE statement with a Sub Query
 Key: HIVE-8960
 URL: https://issues.apache.org/jira/browse/HIVE-8960
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 0.13.0
 Environment: Secured HDP 2.1.3 with Hive 0.13.0
Reporter: Rémy SAISSY


Comparison with a subquery in a WHERE statement does not work.
Given that id_chargement is an integer:

USE db1;
SELECT * FROM tbl1 a WHERE a.id_chargement  (SELECT b.id_chargement FROM tbl2 
b);

Returns the following parsing error:

Error: Error while compiling statement: FAILED: ParseException line 1:88 cannot 
recognize input near 'SELECT' 'b' '.' in expression specification 
(state=42000,code=4)
java.sql.SQLException: Error while compiling statement: FAILED: ParseException 
line 1:88 cannot recognize input near 'SELECT' 'b' '.' in expression 
specification
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:121)
at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:109)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:231)
at org.apache.hive.beeline.Commands.execute(Commands.java:736)
at org.apache.hive.beeline.Commands.sql(Commands.java:657)
at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:804)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:659)
at 
org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:368)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:351)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
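A common workaround for a parser that rejects scalar subqueries in WHERE is to rewrite the comparison as a cross join against the aggregated value. The sketch below demonstrates the equivalence using SQLite (the comparison operator, elided in the report above, is shown as `>` purely for illustration; table contents are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (id_chargement INTEGER);
    CREATE TABLE tbl2 (id_chargement INTEGER);
    INSERT INTO tbl1 VALUES (1), (5), (9);
    INSERT INTO tbl2 VALUES (3), (4);
""")

# Scalar-subquery form (the shape that fails to parse in the report):
scalar = conn.execute(
    "SELECT a.id_chargement FROM tbl1 a "
    "WHERE a.id_chargement > (SELECT MAX(b.id_chargement) FROM tbl2 b)"
).fetchall()

# Cross-join rewrite: the aggregate becomes a one-row derived table,
# and the non-equi comparison moves to the WHERE clause.
rewritten = conn.execute(
    "SELECT a.id_chargement FROM tbl1 a "
    "CROSS JOIN (SELECT MAX(id_chargement) AS m FROM tbl2) mx "
    "WHERE a.id_chargement > mx.m"
).fetchall()

print(scalar == rewritten)  # True: both keep the same rows
```

The rewrite keeps the comparison out of the join condition, which older parsers with only equi-join support can still accept.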






[jira] [Updated] (HIVE-8960) ParsingException in the WHERE statement with a Sub Query

2014-11-25 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rémy SAISSY updated HIVE-8960:
--
Description: 
Comparison with a subquery in a WHERE statement does not work.
Given that id_chargement is an integer:

USE db1;
SELECT * FROM tbl1 a WHERE a.id_chargement  (SELECT MAX(b.id_chargement) FROM 
tbl2 b);
or
SELECT * FROM tbl1 a WHERE a.id_chargement  (SELECT b.id_chargement FROM tbl2 
b LIMIT 1);

Both return the following parsing error:

Error: Error while compiling statement: FAILED: ParseException line 1:88 cannot 
recognize input near 'SELECT' 'b' '.' in expression specification 
(state=42000,code=4)
java.sql.SQLException: Error while compiling statement: FAILED: ParseException 
line 1:88 cannot recognize input near 'SELECT' 'b' '.' in expression 
specification
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:121)
at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:109)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:231)
at org.apache.hive.beeline.Commands.execute(Commands.java:736)
at org.apache.hive.beeline.Commands.sql(Commands.java:657)
at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:804)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:659)
at 
org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:368)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:351)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)


  was:
Comparison with a Sub query in a WHERE statement does not work.
Given that id_chargement is an integer:

USE db1;
SELECT * FROM tbl1 a WHERE a.id_chargement  (SELECT b.id_chargement FROM tbl2 
b);

Returns the following parsing error:

Error: Error while compiling statement: FAILED: ParseException line 1:88 cannot 
recognize input near 'SELECT' 'b' '.' in expression specification 
(state=42000,code=4)
java.sql.SQLException: Error while compiling statement: FAILED: ParseException 
line 1:88 cannot recognize input near 'SELECT' 'b' '.' in expression 
specification
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:121)
at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:109)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:231)
at org.apache.hive.beeline.Commands.execute(Commands.java:736)
at org.apache.hive.beeline.Commands.sql(Commands.java:657)
at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:804)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:659)
at 
org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:368)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:351)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)



 ParsingException in the WHERE statement with a Sub Query
 

 Key: HIVE-8960
 URL: https://issues.apache.org/jira/browse/HIVE-8960
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 0.13.0
 Environment: Secured HDP 2.1.3 with Hive 0.13.0
Reporter: Rémy SAISSY


[jira] [Commented] (HIVE-8870) errors when selecting a struct field within an array from ORC based tables

2014-11-25 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14224751#comment-14224751
 ] 

Sergio Peña commented on HIVE-8870:
---

Hi [~hagleitn], 

I've heard you have experience with ORC. 
Could you give me a quick review on the attached patch? It is a trivial change.

 errors when selecting a struct field within an array from ORC based tables
 --

 Key: HIVE-8870
 URL: https://issues.apache.org/jira/browse/HIVE-8870
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Query Processor
Affects Versions: 0.13.0, 0.14.0
 Environment: HDP 2.1 / HDP 2.2 (YARN, but no Tez)
Reporter: Michael Haeusler
Assignee: Sergio Peña
 Attachments: HIVE-8870.2.patch




[jira] [Commented] (HIVE-7300) When creating database by specifying location, .db is not created

2014-11-25 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-7300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225285#comment-14225285
 ] 

Sergio Peña commented on HIVE-7300:
---

Hi [~sourabhpotnis],

The LOCATION keyword specifies to Hive where the database you're creating will 
reside. In the example you wrote, you are specifying the location as 
/addh0010/hive/addh0011/warehouse, so all tables will be stored in that 
location.

If you want to create a different location for the database, then you will need 
to run this:
# hdfs dfs -mkdir /addh0010/hive/addh0011/warehouse/test_loc.db

On Hive:
# CREATE DATABASE test_loc LOCATION 
'/addh0010/hive/addh0011/warehouse/test_loc.db';

 When creating database by specifying location, .db is not created
 -

 Key: HIVE-7300
 URL: https://issues.apache.org/jira/browse/HIVE-7300
 Project: Hive
  Issue Type: Bug
Reporter: sourabh potnis
  Labels: .db, database, location

 When I create a database without specifying location:
 e.g. create database test;
 it will get created in /apps/hive/warehouse/ as /apps/hive/warehouse/test.db
 But when I create database by specifying location:
 e.g. create database test_loc location '/addh0010/hive/addh0011/warehouse';
 Database will be created but /addh0010/hive/addh0011/warehouse/test_loc.db 
 does not get created.
 So if a user tries to create two tables with the same name in two different 
 databases at the same location, it is not clear which table was created.
 So when a database is created with a location, a .db directory with the 
 database name should be created at that location.





[jira] [Resolved] (HIVE-7300) When creating database by specifying location, .db is not created

2014-11-25 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-7300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña resolved HIVE-7300.
---
Resolution: Invalid

 When creating database by specifying location, .db is not created
 -

 Key: HIVE-7300
 URL: https://issues.apache.org/jira/browse/HIVE-7300
 Project: Hive
  Issue Type: Bug
Reporter: sourabh potnis
  Labels: .db, database, location



[jira] [Assigned] (HIVE-8974) Upgrade to Calcite 1.0.0-SNAPSHOT (with lots of renames)

2014-11-26 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez reassigned HIVE-8974:
-

Assignee: Jesús Camacho Rodríguez  (was: Gunther Hagleitner)

 Upgrade to Calcite 1.0.0-SNAPSHOT (with lots of renames)
 

 Key: HIVE-8974
 URL: https://issues.apache.org/jira/browse/HIVE-8974
 Project: Hive
  Issue Type: Task
Reporter: Julian Hyde
Assignee: Jesús Camacho Rodríguez

 Calcite recently (after 0.9.2, before 1.0.0) re-organized its package 
 structure and renamed a lot of classes. CALCITE-296 has the details, 
 including a description of the before:after mapping.
 This task is to upgrade to the version of Calcite that has the renamed 
 packages. There is a 1.0.0-SNAPSHOT in Apache nexus.
 Calcite functionality has not changed significantly, so it should be 
 straightforward to rename. This task should be completed ASAP, before Calcite 
 moves on.





[jira] [Created] (HIVE-8987) slf4j classpath collision when starting hive metastore

2014-11-27 Thread JIRA
André Kelpe created HIVE-8987:
-

 Summary: slf4j classpath collision when starting hive metastore
 Key: HIVE-8987
 URL: https://issues.apache.org/jira/browse/HIVE-8987
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
 Environment: Apache Hadoop 2.5.2 and Apache Hive 0.14
Reporter: André Kelpe


The latest release introduced a classpath collision. When I start the 
metastore, I see an SLF4J error:


{code}
apache-hive-0.14.0-bin/bin/hive --service metastore 
Starting Hive Metastore Server
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/opt/hadoop-2.5.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/vagrant/apache-hive-0.14.0-bin/lib/hive-jdbc-0.14.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
{code}
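
One way to confirm which jars carry a binding is to scan every classpath entry for `org/slf4j/impl/StaticLoggerBinder.class`. The sketch below simulates two classpath entries as directories so it runs anywhere; against real jars you would test each one with `unzip -l "$jar" | grep StaticLoggerBinder` instead. All paths and names here are illustrative, not the actual Hive/Hadoop layout.

```shell
# Simulate two classpath entries that each bundle an SLF4J binding; with
# real jars, replace the directory test with: unzip -l "$jar" | grep ...
mkdir -p demo_slf4j/hadoop-lib/slf4j-log4j12/org/slf4j/impl \
         demo_slf4j/hive-lib/hive-jdbc-standalone/org/slf4j/impl
touch demo_slf4j/hadoop-lib/slf4j-log4j12/org/slf4j/impl/StaticLoggerBinder.class
touch demo_slf4j/hive-lib/hive-jdbc-standalone/org/slf4j/impl/StaticLoggerBinder.class

found=0
for entry in demo_slf4j/hadoop-lib/* demo_slf4j/hive-lib/*; do
  if [ -e "$entry/org/slf4j/impl/StaticLoggerBinder.class" ]; then
    echo "SLF4J binding found in: $entry"
    found=$((found + 1))
  fi
done
echo "bindings: $found"
```

More than one binding reported means a collision; the usual fix is to drop the binding jar from one side of the classpath so only a single `StaticLoggerBinder` remains.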



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8870) errors when selecting a struct field within an array from ORC based tables

2014-12-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232140#comment-14232140
 ] 

Sergio Peña commented on HIVE-8870:
---

Hi [~owen.omalley],

Could you help me reviewing this simple patch? 
[~brocknoland] told me you have experience with ORC.

Thanks

 errors when selecting a struct field within an array from ORC based tables
 --

 Key: HIVE-8870
 URL: https://issues.apache.org/jira/browse/HIVE-8870
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Query Processor
Affects Versions: 0.13.0, 0.14.0
 Environment: HDP 2.1 / HDP 2.2 (YARN, but no Tez)
Reporter: Michael Haeusler
Assignee: Sergio Peña
 Attachments: HIVE-8870.2.patch


 When using ORC as the storage format for a table, we get errors when selecting a struct 
 field within an array. These errors do not appear with the default format.
 {code:sql}
 CREATE  TABLE `foobar_orc`(
   `uid` bigint,
   `elements` array<struct<elementid:bigint,foo:struct<bar:string>>>)
 STORED AS ORC;
 {code}
 When selecting from this _empty_ table, we get a direct NPE within the Hive 
 CLI:
 {code:sql}
 SELECT
   elements.elementId
 FROM
   foobar_orc;
 -- FAILED: RuntimeException java.lang.NullPointerException
 {code}
 A more real-world query produces a RuntimeException / NullPointerException in 
 the mapper:
 {code:sql}
 SELECT
   uid,
   element.elementId
 FROM
   foobar_orc
 LATERAL VIEW
   EXPLODE(elements) e AS element;
 Error: java.lang.RuntimeException: Error in configuring object
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
 [...]
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.ExprNodeFieldEvaluator.initialize(ExprNodeFieldEvaluator.java:61)
 [...]
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask
 {code}
 Both queries run fine on a non-orc table:
 {code:sql}
 CREATE  TABLE `foobar`(
   `uid` bigint,
   `elements` array<struct<elementid:bigint,foo:struct<bar:string>>>);
 SELECT
   elements.elementId
 FROM
   foobar;
 -- OK
 -- Time taken: 0.225 seconds
 SELECT
   uid,
   element.elementId
 FROM
   foobar
 LATERAL VIEW
   EXPLODE(elements) e AS element;
 -- Total MapReduce CPU Time Spent: 1 seconds 920 msec
 -- OK
 -- Time taken: 25.905 seconds
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8870) errors when selecting a struct field within an array from ORC based tables

2014-12-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8870:
--
Attachment: (was: HIVE-8870.2.patch)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8870) errors when selecting a struct field within an array from ORC based tables

2014-12-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8870:
--
Status: Patch Available  (was: Open)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8870) errors when selecting a struct field within an array from ORC based tables

2014-12-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8870:
--
Status: Open  (was: Patch Available)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8870) errors when selecting a struct field within an array from ORC based tables

2014-12-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8870:
--
Attachment: HIVE-8870.3.patch




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

