[jira] [Updated] (HIVE-5300) MapredLocalTask logs success message twice
[ https://issues.apache.org/jira/browse/HIVE-5300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-5300:
------------------------
    Status: Patch Available  (was: Open)

> MapredLocalTask logs success message twice
> ------------------------------------------
>
>                 Key: HIVE-5300
>                 URL: https://issues.apache.org/jira/browse/HIVE-5300
>             Project: Hive
>          Issue Type: Improvement
>          Components: Logging
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Trivial
>         Attachments: HIVE-5300.1.patch.txt
>
> Something like this:
> {noformat}
> Execution completed successfully
> Mapred Local Task Succeeded . Convert the Join into MapJoin
> Mapred Local Task Succeeded . Convert the Join into MapJoin
> Launching Job 1 out of 1
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5300) MapredLocalTask logs success message twice
[ https://issues.apache.org/jira/browse/HIVE-5300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-5300:
------------------------
    Attachment: HIVE-5300.1.patch.txt
[jira] [Commented] (HIVE-5172) TUGIContainingTransport returning null transport, causing intermittent SocketTimeoutException on hive client and NullPointerException in TUGIBasedProcessor on the server
[ https://issues.apache.org/jira/browse/HIVE-5172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769239#comment-13769239 ]

Ashutosh Chauhan commented on HIVE-5172:
----------------------------------------
As I mentioned in https://issues.apache.org/jira/browse/HIVE-3805?focusedCommentId=13533106&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13533106 this TUGIContainingTransport is really a hack; the current way to do this is to use {{Plain Sasl Server}}, otherwise we keep running into problems like this. [~agateaaa], wondering if you would like to pursue the 'proper fix'? If not, then I need to think a bit about this current patch. Will get back to you soon.

> TUGIContainingTransport returning null transport, causing intermittent SocketTimeoutException on hive client and NullPointerException in TUGIBasedProcessor on the server
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-5172
>                 URL: https://issues.apache.org/jira/browse/HIVE-5172
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 0.9.0, 0.10.0, 0.11.0
>            Reporter: agate
>         Attachments: HIVE-5172.1.patch.txt
>
> We are running into a frequent problem using HCatalog 0.4.1 (Hive Metastore Server 0.9) where we get connection reset or connection timeout errors on the client and a NullPointerException in TUGIBasedProcessor on the server.
> {code}
> hive client logs:
> =================
> org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>         at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
>         at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>         at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
>         at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
>         at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
>         at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_set_ugi(ThriftHiveMetastore.java:2136)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.set_ugi(ThriftHiveMetastore.java:2122)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.openStore(HiveMetaStoreClient.java:286)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:197)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.init(HiveMetaStoreClient.java:157)
>         at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2092)
>         at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2102)
>         at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:888)
>         at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:830)
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:954)
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7524)
>         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431)
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:909)
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:341)
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:642)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:557)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Caused by: java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
>         ... 31 more
> {code}
> {code}
> hive metastore server logs:
> ===========================
> 2013-07-26 06:34:52,853 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run(182)) - Error occurred during processing of message.
> java.lang.NullPointerException
>         at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.setIpAddress(TUGIBasedProcessor.java:183)
>         at
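The NullPointerException above indicates the processor received a null transport from the cache. As a hedged illustration only (this is not the HIVE-5172 patch, and all names below are hypothetical), one defensive pattern is to build the transport wrapper eagerly and publish it with ConcurrentMap.putIfAbsent, so a lookup can never hand back null even under concurrent access:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: a cache keyed by an underlying transport that never
// returns null. Creating the wrapper first and publishing it atomically with
// putIfAbsent avoids the window where a weakly-held value could disappear
// between lookup and use.
public class TransportCache<K, V> {
    private final ConcurrentMap<K, V> cache = new ConcurrentHashMap<>();

    /** Factory for building the wrapper when it is not cached yet. */
    public interface Factory<K, V> { V create(K key); }

    public V get(K key, Factory<K, V> factory) {
        V existing = cache.get(key);
        if (existing != null) {
            return existing;                        // cache hit
        }
        V created = factory.create(key);            // build eagerly
        V raced = cache.putIfAbsent(key, created);  // publish atomically
        return (raced != null) ? raced : created;   // never null
    }
}
```

If two threads race on the same key, putIfAbsent guarantees both end up with the same published wrapper.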
[jira] [Updated] (HIVE-3764) Support metastore version consistency check
[ https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasad Mujumdar updated HIVE-3764:
----------------------------------
    Attachment: (was: HIVE-3764.4.patch)

> Support metastore version consistency check
> -------------------------------------------
>
>                 Key: HIVE-3764
>                 URL: https://issues.apache.org/jira/browse/HIVE-3764
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>    Affects Versions: 0.8.0, 0.9.0, 0.10.0, 0.11.0
>            Reporter: Prasad Mujumdar
>            Assignee: Prasad Mujumdar
>             Fix For: 0.12.0
>         Attachments: HIVE-3764.1.patch, HIVE-3764.4.patch
>
> Today there's no version/compatibility information stored in the hive metastore. Also, the datanucleus configuration property to automatically create missing tables is enabled by default. If you happen to start an older or newer hive, or don't run the correct upgrade scripts during migration, the metastore would end up corrupted. The autoCreate schema is not always sufficient to upgrade the metastore when migrating to a newer release: it's not supported with all databases, and the migration often involves altering existing tables, changing or moving data, etc. Hence it's very useful to have a consistency check to make sure that hive is using the correct metastore, and that for production systems the schema is not changed automatically just by running hive.
> Besides, it would be helpful to add a tool that can leverage this version information to figure out the required set of upgrade scripts and execute those against the configured metastore. Now that Hive includes the Beeline client, it can be used to execute the scripts.
[jira] [Updated] (HIVE-3764) Support metastore version consistency check
[ https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasad Mujumdar updated HIVE-3764:
----------------------------------
    Attachment: (was: HIVE-3764-0.13-addional-file.patch)
Re: Review Request 14120: HIVE-3764: Support metastore version consistency check
On Sept. 13, 2013, 1:35 p.m., Brock Noland wrote:
> Prasad, this looks really good! I just had two people email me directly yesterday, and both were using the incorrect metastore version. Have you run the new unit tests a couple of times? Have you done any other testing?

Addressed the comments. The MetaException doesn't support nesting, so changed that to HiveMetaException. Added more tests. Manually tested the init and upgrade operations using Derby and MySQL. As discussed on the ticket, I am going to split the patch into two separate tickets and will close this review.

- Prasad

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14120/#review26079
-----------------------------------------------------------

On Sept. 13, 2013, 7:53 a.m., Prasad Mujumdar wrote:

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14120/
-----------------------------------------------------------

(Updated Sept. 13, 2013, 7:53 a.m.)

Review request for hive.

Bugs: HIVE-3764
    https://issues.apache.org/jira/browse/HIVE-3764

Repository: hive-git

Description
-----------
- Added a new table in the metastore schema to store the Hive version in the metastore.
- The metastore handler compares the version stored in the schema with its own version. If there's a mismatch, it can either record the correct version or raise an error. The behavior is configurable via a new Hive config; when set, this config also restricts DataNucleus from auto-upgrading the schema.
- The new schema creation and upgrade scripts record the new version in the metastore version table.
- Added 0.12 upgrade scripts for all supported DBs to create the new version table in the 0.12 metastore schema.
- Added a new schemaTool that can perform schema initialization or upgrade based on the schema version and product version.
Diffs
-----
  beeline/src/java/org/apache/hive/beeline/HiveSchemaHelper.java PRE-CREATION
  beeline/src/java/org/apache/hive/beeline/HiveSchemaTool.java PRE-CREATION
  beeline/src/test/org/apache/hive/beeline/src/test/TestSchemaTool.java PRE-CREATION
  bin/ext/schemaTool.sh PRE-CREATION
  bin/schematool PRE-CREATION
  build-common.xml ad5ac23
  build.xml 3e87163
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 22149e4
  conf/hive-default.xml.template 9a3fc1d
  metastore/scripts/upgrade/derby/014-HIVE-3764.derby.sql PRE-CREATION
  metastore/scripts/upgrade/derby/hive-schema-0.12.0.derby.sql cce544f
  metastore/scripts/upgrade/derby/upgrade-0.10.0-to-0.11.0.derby.sql cae7936
  metastore/scripts/upgrade/derby/upgrade-0.11.0-to-0.12.0.derby.sql 492cc93
  metastore/scripts/upgrade/derby/upgrade.order.derby PRE-CREATION
  metastore/scripts/upgrade/mysql/014-HIVE-3764.mysql.sql PRE-CREATION
  metastore/scripts/upgrade/mysql/hive-schema-0.12.0.mysql.sql 22a77fe
  metastore/scripts/upgrade/mysql/upgrade-0.11.0-to-0.12.0.mysql.sql 375a05f
  metastore/scripts/upgrade/mysql/upgrade.order.mysql PRE-CREATION
  metastore/scripts/upgrade/oracle/014-HIVE-3764.oracle.sql PRE-CREATION
  metastore/scripts/upgrade/oracle/hive-schema-0.12.0.oracle.sql 85a0178
  metastore/scripts/upgrade/oracle/upgrade-0.10.0-to-0.11.0.mysql.sql PRE-CREATION
  metastore/scripts/upgrade/oracle/upgrade-0.11.0-to-0.12.0.oracle.sql a2d0901
  metastore/scripts/upgrade/oracle/upgrade.order.oracle PRE-CREATION
  metastore/scripts/upgrade/postgres/014-HIVE-3764.postgres.sql PRE-CREATION
  metastore/scripts/upgrade/postgres/hive-schema-0.12.0.postgres.sql 7b319ba
  metastore/scripts/upgrade/postgres/upgrade-0.11.0-to-0.12.0.postgres.sql 9da0a1b
  metastore/scripts/upgrade/postgres/upgrade.order.postgres PRE-CREATION
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 39dda92
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreSchemaInfo.java PRE-CREATION
  metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java a08c728
  metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java e410c3a
  metastore/src/model/org/apache/hadoop/hive/metastore/model/MVersionTable.java PRE-CREATION
  metastore/src/model/package.jdo c42b5b0
  metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java 8066784
  metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java 0f9b16c
  metastore/src/test/org/apache/hadoop/hive/metastore/TestMetastoreVersion.java PRE-CREATION

Diff: https://reviews.apache.org/r/14120/diff/

Testing
-------
Added new tests for schema verification and schemaTool.

Thanks,

Prasad Mujumdar
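The version check described in this review (compare the version recorded in the metastore with the binary's version, and either record it or fail depending on a config flag) can be sketched as follows. This is an illustrative sketch only, not the actual patch: the class name SchemaVersionChecker, the method verify, and the strict flag are hypothetical stand-ins for the real config-driven behavior.

```java
// Hypothetical sketch of the metastore schema version check described above.
public class SchemaVersionChecker {
    /**
     * @param storedVersion version recorded in the metastore version table,
     *                      or null if no version has been recorded yet
     * @param binaryVersion version compiled into the running Hive binary
     * @param strict        when true, any mismatch is an error; when false,
     *                      the binary's version is recorded instead
     * @return the version that should end up recorded in the schema
     */
    public static String verify(String storedVersion, String binaryVersion, boolean strict) {
        if (storedVersion == null || !storedVersion.equals(binaryVersion)) {
            if (strict) {
                throw new IllegalStateException("Metastore schema version " + storedVersion
                        + " does not match Hive binary version " + binaryVersion);
            }
            return binaryVersion; // lenient mode: record the binary's version
        }
        return storedVersion;     // versions agree, nothing to do
    }
}
```

In strict mode this is the "raise error" branch; in lenient mode it is the "record the correct version" branch described in the review.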
[jira] [Commented] (HIVE-5300) MapredLocalTask logs success message twice
[ https://issues.apache.org/jira/browse/HIVE-5300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769244#comment-13769244 ]

Ashutosh Chauhan commented on HIVE-5300:
----------------------------------------
+1
Review Request 14169: HIVE-3764: Support metastore version consistency check
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14169/
-----------------------------------------------------------

Review request for hive, Ashutosh Chauhan and Brock Noland.

Bugs: HIVE-3764
    https://issues.apache.org/jira/browse/HIVE-3764

Repository: hive-git

Description
-----------
This is a 0.12-specific patch. The trunk patch will include additional metastore scripts, which I will attach separately to the ticket.

- Added a new table in the metastore schema to store the Hive version in the metastore.
- The metastore handler compares the version stored in the schema with its own version. If there's a mismatch, it can either record the correct version or raise an error. The behavior is configurable via a new Hive config; when set, this config also restricts DataNucleus from auto-upgrading the schema.
- The new schema creation and upgrade scripts record the new version in the metastore version table.
- Added 0.12 upgrade scripts for all supported DBs to create the new version table in the 0.12 metastore schema.

The current patch has the verification turned off by default. I would prefer to keep it enabled, though that would require any ad-hoc setup to explicitly disable it (or to create the metastore schema by running the scripts). The default can be changed or left as is, per the consensus.
Diffs
-----
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 22149e4
  conf/hive-default.xml.template 9a3fc1d
  metastore/scripts/upgrade/derby/014-HIVE-3764.derby.sql PRE-CREATION
  metastore/scripts/upgrade/derby/hive-schema-0.12.0.derby.sql cce544f
  metastore/scripts/upgrade/derby/upgrade-0.10.0-to-0.11.0.derby.sql cae7936
  metastore/scripts/upgrade/derby/upgrade-0.11.0-to-0.12.0.derby.sql 492cc93
  metastore/scripts/upgrade/derby/upgrade.order.derby PRE-CREATION
  metastore/scripts/upgrade/mysql/014-HIVE-3764.mysql.sql PRE-CREATION
  metastore/scripts/upgrade/mysql/hive-schema-0.12.0.mysql.sql 22a77fe
  metastore/scripts/upgrade/mysql/upgrade-0.11.0-to-0.12.0.mysql.sql 375a05f
  metastore/scripts/upgrade/mysql/upgrade.order.mysql PRE-CREATION
  metastore/scripts/upgrade/oracle/014-HIVE-3764.oracle.sql PRE-CREATION
  metastore/scripts/upgrade/oracle/hive-schema-0.12.0.oracle.sql 85a0178
  metastore/scripts/upgrade/oracle/upgrade-0.10.0-to-0.11.0.mysql.sql PRE-CREATION
  metastore/scripts/upgrade/oracle/upgrade-0.11.0-to-0.12.0.oracle.sql a2d0901
  metastore/scripts/upgrade/oracle/upgrade.order.oracle PRE-CREATION
  metastore/scripts/upgrade/postgres/014-HIVE-3764.postgres.sql PRE-CREATION
  metastore/scripts/upgrade/postgres/hive-schema-0.12.0.postgres.sql 7b319ba
  metastore/scripts/upgrade/postgres/upgrade-0.11.0-to-0.12.0.postgres.sql 9da0a1b
  metastore/scripts/upgrade/postgres/upgrade.order.postgres PRE-CREATION
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 39dda92
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreSchemaInfo.java PRE-CREATION
  metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java a27243d
  metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java e410c3a
  metastore/src/model/org/apache/hadoop/hive/metastore/model/MVersionTable.java PRE-CREATION
  metastore/src/model/package.jdo c42b5b0
  metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java 8066784
  metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java 0f9b16c
  metastore/src/test/org/apache/hadoop/hive/metastore/TestMetastoreVersion.java PRE-CREATION

Diff: https://reviews.apache.org/r/14169/diff/

Testing
-------
Added new tests for schema verification. Manually tested the upgrades using Derby and MySQL.

Thanks,

Prasad Mujumdar
[jira] [Updated] (HIVE-3764) Support metastore version consistency check
[ https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasad Mujumdar updated HIVE-3764:
----------------------------------
    Description:
Today there's no version/compatibility information stored in the hive metastore. Also, the datanucleus configuration property to automatically create missing tables is enabled by default. If you happen to start an older or newer hive, or don't run the correct upgrade scripts during migration, the metastore would end up corrupted. The autoCreate schema is not always sufficient to upgrade the metastore when migrating to a newer release: it's not supported with all databases, and the migration often involves altering existing tables, changing or moving data, etc. Hence it's very useful to have a consistency check to make sure that hive is using the correct metastore, and that for production systems the schema is not changed automatically just by running hive.

  was:
Today there's no version/compatibility information stored in the hive metastore. Also, the datanucleus configuration property to automatically create missing tables is enabled by default. If you happen to start an older or newer hive, or don't run the correct upgrade scripts during migration, the metastore would end up corrupted. The autoCreate schema is not always sufficient to upgrade the metastore when migrating to a newer release: it's not supported with all databases, and the migration often involves altering existing tables, changing or moving data, etc. Hence it's very useful to have a consistency check to make sure that hive is using the correct metastore, and that for production systems the schema is not changed automatically just by running hive.
Besides, it would be helpful to add a tool that can leverage this version information to figure out the required set of upgrade scripts and execute those against the configured metastore. Now that Hive includes the Beeline client, it can be used to execute the scripts.
[jira] [Commented] (HIVE-3764) Support metastore version consistency check
[ https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769247#comment-13769247 ]

Prasad Mujumdar commented on HIVE-3764:
---------------------------------------
New review request for the updated patch at https://reviews.apache.org/r/14169/
[jira] [Created] (HIVE-5301) Add a schema tool for offline metastore schema upgrade
Prasad Mujumdar created HIVE-5301:
-------------------------------------

             Summary: Add a schema tool for offline metastore schema upgrade
                 Key: HIVE-5301
                 URL: https://issues.apache.org/jira/browse/HIVE-5301
             Project: Hive
          Issue Type: Bug
          Components: Metastore
    Affects Versions: 0.11.0
            Reporter: Prasad Mujumdar
            Assignee: Prasad Mujumdar
             Fix For: 0.12.0

HIVE-3764 is addressing metastore version consistency. Besides that, it would be helpful to add a tool that can leverage this version information to figure out the required set of upgrade scripts and execute those against the configured metastore. Now that Hive includes the Beeline client, it can be used to execute the scripts.
[jira] [Updated] (HIVE-5301) Add a schema tool for offline metastore schema upgrade
[ https://issues.apache.org/jira/browse/HIVE-5301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasad Mujumdar updated HIVE-5301:
----------------------------------
    Attachment: HIVE-5301-with-HIVE-3764.0.patch

Combined HIVE-3764 + HIVE-5301 patch, for testing.
[jira] [Updated] (HIVE-5301) Add a schema tool for offline metastore schema upgrade
[ https://issues.apache.org/jira/browse/HIVE-5301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasad Mujumdar updated HIVE-5301:
----------------------------------
    Attachment: HIVE-5301.1.patch

Patch attached; requires the HIVE-3764 patch.
Review Request 14170: HIVE-5301: Add a schema tool for offline metastore schema upgrade
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14170/
-----------------------------------------------------------

Review request for hive, Ashutosh Chauhan, Brock Noland, and Thejas Nair.

Bugs: HIVE-5301
    https://issues.apache.org/jira/browse/HIVE-5301

Repository: hive-git

Description
-----------
Schema tool to initialize and migrate the hive metastore schema:
- Extract the metastore connection details from the hive configuration.
- The target version is extracted from the binary, and from the metastore if possible; optionally it can be specified as an argument.
- Determine the scripts that need to be executed for the initialization or upgrade.
- Handle DB nested scripts.
- Execute the required scripts using beeline.

Diffs
-----
  beeline/src/java/org/apache/hive/beeline/BeeLine.java 4c6eb9b
  beeline/src/java/org/apache/hive/beeline/HiveSchemaHelper.java PRE-CREATION
  beeline/src/java/org/apache/hive/beeline/HiveSchemaTool.java PRE-CREATION
  beeline/src/test/org/apache/hive/beeline/src/test/TestSchemaTool.java PRE-CREATION
  bin/ext/schemaTool.sh PRE-CREATION
  bin/schematool PRE-CREATION
  build-common.xml ad5ac23
  build.xml 3e87163

Diff: https://reviews.apache.org/r/14170/diff/

Testing
-------
Added unit tests. Manually tested various options using Derby and MySQL.

Thanks,

Prasad Mujumdar
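The "determine the scripts that need to be executed" step can be illustrated with a small sketch. The upgrade.order files listed in the HIVE-3764 diffs suggest an ordered sequence of upgrade steps; assuming entries of the form "0.11.0-to-0.12.0", selecting the scripts for an upgrade might look like this (the class name, method, and script file-name convention are hypothetical, not the actual HiveSchemaTool code):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of upgrade-script selection: walk the ordered upgrade
// steps, start collecting at the source version, and stop once the target
// version is reached.
public class UpgradeScriptSelector {
    public static List<String> scriptsFor(List<String> upgradeOrder,
                                          String fromVersion, String toVersion) {
        List<String> scripts = new ArrayList<>();
        boolean collecting = false;
        for (String step : upgradeOrder) {          // e.g. "0.11.0-to-0.12.0"
            String[] parts = step.split("-to-");
            if (parts[0].equals(fromVersion)) {
                collecting = true;                  // start at our version
            }
            if (collecting) {
                scripts.add("upgrade-" + step + ".sql");
                if (parts[1].equals(toVersion)) {
                    break;                          // reached the target
                }
            }
        }
        return scripts;
    }
}
```

Each selected script would then be handed to beeline for execution, per the description above.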
[jira] [Updated] (HIVE-5301) Add a schema tool for offline metastore schema upgrade
[ https://issues.apache.org/jira/browse/HIVE-5301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasad Mujumdar updated HIVE-5301:
----------------------------------
    Status: Patch Available  (was: Open)
[jira] [Commented] (HIVE-3764) Support metastore version consistency check
[ https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769252#comment-13769252 ]

Prasad Mujumdar commented on HIVE-3764:
---------------------------------------
The schema tool part is addressed via HIVE-5301.
[jira] [Commented] (HIVE-4732) Reduce or eliminate the expensive Schema equals() check for AvroSerde
[ https://issues.apache.org/jira/browse/HIVE-4732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769275#comment-13769275 ]

Hive QA commented on HIVE-4732:
-------------------------------
{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12603500/HIVE-4732.5.patch

{color:green}SUCCESS:{color} +1 3126 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/774/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/774/console
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

> Reduce or eliminate the expensive Schema equals() check for AvroSerde
> ---------------------------------------------------------------------
>
>                 Key: HIVE-4732
>                 URL: https://issues.apache.org/jira/browse/HIVE-4732
>             Project: Hive
>          Issue Type: Improvement
>          Components: Serializers/Deserializers
>            Reporter: Mark Wagner
>            Assignee: Mohammad Kamrul Islam
>         Attachments: HIVE-4732.1.patch, HIVE-4732.4.patch, HIVE-4732.5.patch, HIVE-4732.v1.patch, HIVE-4732.v4.patch
>
> The AvroSerde spends a significant amount of time checking schema equality. Changing to compare hashcodes (which can be computed once and then reused) will improve performance.
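The optimization described in this ticket, comparing cached hashcodes before falling back to a full equality check, can be sketched as below. This is not the actual AvroSerde patch: CachedSchema is a hypothetical wrapper, and a plain String stands in for an Avro Schema. Note that a matching hashcode alone does not prove equality, so a deep comparison is still required on a hash hit; the win is that a hash mismatch rules out equality cheaply.

```java
// Hypothetical sketch of the optimization: cache each schema's hashcode
// once, and use it to rule out equality cheaply before doing the expensive
// deep equals().
public class CachedSchema {
    private final String schema;   // stand-in for an Avro Schema
    private final int hash;        // computed once, reused on every compare

    public CachedSchema(String schema) {
        this.schema = schema;
        this.hash = schema.hashCode();
    }

    @Override
    public int hashCode() { return hash; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CachedSchema)) return false;
        CachedSchema other = (CachedSchema) o;
        if (hash != other.hash) return false;   // cheap negative check
        return schema.equals(other.schema);     // expensive deep comparison
    }
}
```

The deep comparison now runs only when the cached hashes collide, which for distinct schemas is rare.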
[jira] [Updated] (HIVE-5211) ALTER TABLE does not change the type of column for a table with AVRO data
[ https://issues.apache.org/jira/browse/HIVE-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HIVE-5211: -- Labels: avro (was: ) ALTER TABLE does not change the type of column for a table with AVRO data - Key: HIVE-5211 URL: https://issues.apache.org/jira/browse/HIVE-5211 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.11.0 Reporter: Neha Tomar Labels: avro 1. Created a table in Hive with AVRO data. hive> CREATE EXTERNAL TABLE sample ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' LOCATION '/home/neha/test_data/avrodata' TBLPROPERTIES ('avro.schema.literal'='{"type": "record","name": "TUPLE_3","fields": [ { "name": "sample_id","type": [ "null", "int" ],"doc": "autogenerated from Pig Field Schema"} ]}' ); OK Time taken: 0.16 seconds hive> describe sample; OK sample_id int from deserializer Time taken: 0.516 seconds, Fetched: 1 row(s) 2. Alter the type of the column from int to bigint. It displays OK as the result of DDL execution. However, describing the table still shows the previous data type. hive> alter table sample change sample_id int bigint; OK Time taken: 0.614 seconds hive> describe sample; OK sample_id int from deserializer Time taken: 0.4 seconds, Fetched: 1 row(s) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HIVE-4116) Can't use views using map datatype.
[ https://issues.apache.org/jira/browse/HIVE-4116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis reassigned HIVE-4116: --- Assignee: Navis Can't use views using map datatype. --- Key: HIVE-4116 URL: https://issues.apache.org/jira/browse/HIVE-4116 Project: Hive Issue Type: Bug Affects Versions: 0.8.1, 0.10.0 Reporter: Karel Vervaeke Assignee: Navis Executing the following {noformat} DROP TABLE IF EXISTS `items`; CREATE TABLE IF NOT EXISTS `items` (id INT, name STRING, info MAP<STRING,STRING>) PARTITIONED BY (ds STRING); DROP VIEW IF EXISTS `priceview`; CREATE VIEW `priceview` AS SELECT `items`.`id`, `items`.info['price'] FROM `items` ; select * from `priceview`; {noformat} Produces the following error: {noformat} karel@tomato:~/tmp$ $HIVE_HOME/bin/hive -f hivebug.sql WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files. Logging initialized using configuration in jar:file:/home/karel/opt/hive-0.10.0-bin/lib/hive-common-0.10.0.jar!/hive-log4j.properties Hive history file=/tmp/karel/hive_job_log_karel_201303051117_945318761.txt SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/home/karel/opt/hadoop-2.0.0-mr1-cdh4.0.0/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/home/karel/opt/hive-0.10.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. OK Time taken: 5.449 seconds OK Time taken: 0.303 seconds OK Time taken: 0.131 seconds OK Time taken: 0.206 seconds FAILED: SemanticException line 3:22 mismatched input '.' 
expecting FROM near '`items`' in from clause in definition of VIEW priceview [ SELECT `items`.`id`, `items``items`.`info`info['price'] FROM `default`.`items` ] used as priceview at Line 3:14 {noformat} Unless I'm not using the right syntax, I would expect this simple example to work. I have tried some variations (quotes, no quotes, ...), to no avail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4116) Can't use views using map datatype.
[ https://issues.apache.org/jira/browse/HIVE-4116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-4116: Status: Patch Available (was: Open) Can't use views using map datatype. --- Key: HIVE-4116 URL: https://issues.apache.org/jira/browse/HIVE-4116 Project: Hive Issue Type: Bug Affects Versions: 0.11.0, 0.10.0, 0.8.1 Reporter: Karel Vervaeke Assignee: Navis -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4116) Can't use views using map datatype.
[ https://issues.apache.org/jira/browse/HIVE-4116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-4116: Affects Version/s: 0.11.0 Can't use views using map datatype. --- Key: HIVE-4116 URL: https://issues.apache.org/jira/browse/HIVE-4116 Project: Hive Issue Type: Bug Affects Versions: 0.8.1, 0.10.0, 0.11.0 Reporter: Karel Vervaeke Assignee: Navis -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4116) Can't use views using map datatype.
[ https://issues.apache.org/jira/browse/HIVE-4116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4116: -- Attachment: D12975.1.patch navis requested code review of HIVE-4116 [jira] Can't use views using map datatype. Reviewers: JIRA TEST PLAN EMPTY REVISION DETAIL https://reviews.facebook.net/D12975 AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/parse/ParseDriver.java ql/src/test/queries/clientpositive/create_view_translate.q ql/src/test/results/clientpositive/create_view_translate.q.out MANAGE HERALD RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? https://reviews.facebook.net/herald/transcript/31011/ To: JIRA, navis Can't use views using map datatype. --- Key: HIVE-4116 URL: https://issues.apache.org/jira/browse/HIVE-4116 Project: Hive Issue Type: Bug Affects Versions: 0.8.1, 0.10.0, 0.11.0 Reporter: Karel Vervaeke Assignee: Navis Attachments: D12975.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA,
[jira] [Resolved] (HIVE-4173) Hive Ignoring where clause for multitable insert
[ https://issues.apache.org/jira/browse/HIVE-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis resolved HIVE-4173. - Resolution: Duplicate Hive Ignoring where clause for multitable insert - Key: HIVE-4173 URL: https://issues.apache.org/jira/browse/HIVE-4173 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.8.1, 0.9.0 Environment: Red Hat Enterprise Linux Server release 6.3 (Santiago), Reporter: hussain Priority: Critical Hive is ignoring filter conditions given in a multi-insert select statement when a filter is also given on the source query. To highlight this issue, see the example below: the where clause (status!='C') on the employee12 source query causes the per-insert filters (batch_id='12' and batch_id!='12') to stop working, dumping all the data coming from the source into both tables. I have checked the hive execution plan and didn't find filter predicates for filtering records per insert statement from (from employee12 select * where status!='C') t insert into table employee1 select status, field1, 'T' as field2, 'P' as field3, 'C' as field4 where batch_id='12' insert into table employee2 select status, field1, 'D' as field2, 'P' as field3, 'C' as field4 where batch_id!='12'; It works fine with a single insert; Hive generates the plan properly. I am able to reproduce this issue with the 0.8.1 and 0.9.0 versions of Hive. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5122) Add partition for multiple partition ignores locations for non-first partitions
[ https://issues.apache.org/jira/browse/HIVE-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769302#comment-13769302 ] Thejas M Nair commented on HIVE-5122: - Looks good. +1 Add partition for multiple partition ignores locations for non-first partitions --- Key: HIVE-5122 URL: https://issues.apache.org/jira/browse/HIVE-5122 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: D12411.3.patch, D12411.4.patch, HIVE-5122.D12411.1.patch, HIVE-5122.D12411.2.patch http://www.mail-archive.com/user@hive.apache.org/msg09151.html When multiple partitions are added in a single alter table statement, the location given for the first partition is used as the location of all partitions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5296) Memory leak: OOM Error after multiple open/closed JDBC connections.
[ https://issues.apache.org/jira/browse/HIVE-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769320#comment-13769320 ] Douglas commented on HIVE-5296: --- Hi Kousuke, Thanks for your interest. Here are the answers to your questions: 1) This is most definitely the Hiveserver2 process. I validated this by tracking the heap space utilised for the hiveserver2 process over time, as connections were opened and closed. 2) The queries that were being executed were for the most part LOAD DATA INPATH: {code} int returnCode = hc.update("LOAD DATA INPATH '"+fileName+"' OVERWRITE INTO TABLE "+targetTable+" partition (dt='"+cal.getTimeInMillis()+"')"); logger.info(this.getClass()+" returned with value "+returnCode); {code} These were a mix of successes and exceptions. I've yet to validate whether the leak occurs in all instances, or only in those cases where the hiveserver throws Exceptions. 3) I've not had the time to dig into the hiveserver code as yet to find the offending object, but if I do get the chance, I will certainly post my findings and a patch. Memory leak: OOM Error after multiple open/closed JDBC connections. Key: HIVE-5296 URL: https://issues.apache.org/jira/browse/HIVE-5296 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Environment: Hive 0.12.0, Hadoop 1.1.2, Debian. Reporter: Douglas Labels: hiveserver Fix For: 0.12.0 Original Estimate: 168h Remaining Estimate: 168h This error seems to relate to https://issues.apache.org/jira/browse/HIVE-3481 However, on inspection of the related patch and my built version of Hive (patch carried forward to 0.12.0), I am still seeing the described behaviour. Multiple connections to Hiveserver2, all of which are closed and disposed of properly, show the Java heap size growing extremely quickly. 
This issue can be recreated using the following code {code} import java.sql.DriverManager; import java.sql.Connection; import java.sql.ResultSet; import java.sql.SQLException; import java.sql.Statement; import java.util.Properties; import org.apache.hive.service.cli.HiveSQLException; import org.apache.log4j.Logger; /* * Class which encapsulates the lifecycle of a query or statement. * Provides functionality which allows you to create a connection */ public class HiveClient { Connection con; Logger logger; private static String driverName = "org.apache.hive.jdbc.HiveDriver"; private String db; public HiveClient(String db) { logger = Logger.getLogger(HiveClient.class); this.db=db; try{ Class.forName(driverName); }catch(ClassNotFoundException e){ logger.info("Can't find Hive driver"); } String hiveHost = GlimmerServer.config.getString("hive/host"); String hivePort = GlimmerServer.config.getString("hive/port"); String connectionString = "jdbc:hive2://"+hiveHost+":"+hivePort+"/default"; logger.info(String.format("Attempting to connect to %s",connectionString)); try{ con = DriverManager.getConnection(connectionString,"",""); }catch(Exception e){ logger.error("Problem instantiating the connection "+e.getMessage()); } } public int update(String query) { Integer res = 0; Statement stmt = null; try{ stmt = con.createStatement(); String switchdb = "USE "+db; logger.info(switchdb); stmt.executeUpdate(switchdb); logger.info(query); res = stmt.executeUpdate(query); logger.info("Query passed to server"); stmt.close(); }catch(HiveSQLException e){ logger.info(String.format("HiveSQLException thrown, this can be valid, but check the error: %s from the query %s",e.toString(),query)); }catch(SQLException e){ logger.error(String.format("Unable to execute query SQLException %s. Error: %s",query,e)); }catch(Exception e){
[jira] [Created] (HIVE-5302) PartitionPruner fails on Avro non-partitioned data
Sean Busbey created HIVE-5302: - Summary: PartitionPruner fails on Avro non-partitioned data Key: HIVE-5302 URL: https://issues.apache.org/jira/browse/HIVE-5302 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Sean Busbey Priority: Blocker While updating HIVE-3585 I found a test case that causes the failure in the MetaStoreUtils partition retrieval from back in HIVE-4789. in this case, the failure is triggered when the partition pruner is handed a non-partitioned table and has to construct a pseudo-partition. e.g. {code} INSERT OVERWRITE TABLE partitioned_table PARTITION(col) SELECT id, foo, col FROM non_partitioned_table WHERE col = 9; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5297) Hive does not honor type for partition columns
[ https://issues.apache.org/jira/browse/HIVE-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769334#comment-13769334 ] Hive QA commented on HIVE-5297: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12603502/HIVE-5297.2.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 3128 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_type_check org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_table_add_partition org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_view_failure5 {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/775/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/775/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. Hive does not honor type for partition columns -- Key: HIVE-5297 URL: https://issues.apache.org/jira/browse/HIVE-5297 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.11.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5297.1.patch, HIVE-5297.2.patch Hive does not consider the type of the partition column while writing partitions. Consider for example the query: {noformat} create table tab1 (id1 int, id2 string) PARTITIONED BY(month string,day int) row format delimited fields terminated by ','; alter table tab1 add partition (month='June', day='second'); {noformat} Hive accepts this query. However if you try to select from this table and insert into another expecting schema match, it will insert nulls instead. 
We should throw an exception on such user error at the time the partition addition/load happens. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
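The check proposed above — rejecting a partition value that does not parse as the declared partition column type, at the time the partition is added — could look roughly like the following. This is an illustrative sketch only, not Hive's actual code; the class name and the set of handled types are invented for the example.

```java
// Illustrative sketch: validate a partition value against the declared
// partition column type before accepting it, so that e.g. day='second'
// is rejected for a column declared as "day int".
public class PartitionTypeCheck {
    public static void validate(String columnType, String value) {
        try {
            switch (columnType) {
                case "int":    Integer.parseInt(value); break;
                case "bigint": Long.parseLong(value); break;
                case "double": Double.parseDouble(value); break;
                case "string": break; // any value is a valid string
                default:       break; // other types omitted in this sketch
            }
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                "Partition value '" + value + "' is not a valid " + columnType);
        }
    }
}
```

With this check in place, the example from the description (month='June', day='second' against "day int") would fail at add-partition time instead of silently producing nulls on insert.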
[jira] [Updated] (HIVE-5297) Hive does not honor type for partition columns
[ https://issues.apache.org/jira/browse/HIVE-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5297: - Status: Open (was: Patch Available) Hive does not honor type for partition columns -- Key: HIVE-5297 URL: https://issues.apache.org/jira/browse/HIVE-5297 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.11.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5297.1.patch, HIVE-5297.2.patch, HIVE-5297.3.patch Hive does not consider the type of the partition column while writing partitions. Consider for example the query: {noformat} create table tab1 (id1 int, id2 string) PARTITIONED BY(month string,day int) row format delimited fields terminated by ','; alter table tab1 add partition (month='June', day='second'); {noformat} Hive accepts this query. However if you try to select from this table and insert into another expecting schema match, it will insert nulls instead. We should throw an exception on such user error at the time the partition addition/load happens. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5297) Hive does not honor type for partition columns
[ https://issues.apache.org/jira/browse/HIVE-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5297: - Attachment: HIVE-5297.3.patch Fix failing tests. The type of error changed for the negative tests. Hive does not honor type for partition columns -- Key: HIVE-5297 URL: https://issues.apache.org/jira/browse/HIVE-5297 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.11.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5297.1.patch, HIVE-5297.2.patch, HIVE-5297.3.patch Hive does not consider the type of the partition column while writing partitions. Consider for example the query: {noformat} create table tab1 (id1 int, id2 string) PARTITIONED BY(month string,day int) row format delimited fields terminated by ','; alter table tab1 add partition (month='June', day='second'); {noformat} Hive accepts this query. However if you try to select from this table and insert into another expecting schema match, it will insert nulls instead. We should throw an exception on such user error at the time the partition addition/load happens. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request 14155: HIVE-5297 Hive does not honor type for partition columns
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/14155/ --- (Updated Sept. 17, 2013, 9:08 a.m.) Review request for hive and Ashutosh Chauhan. Changes --- Updated test results. Bugs: HIVE-5297 https://issues.apache.org/jira/browse/HIVE-5297 Repository: hive-git Description --- Hive does not consider the type of the partition column while writing partitions. Consider for example the query: create table tab1 (id1 int, id2 string) PARTITIONED BY(month string,day int) row format delimited fields terminated by ','; alter table tab1 add partition (month='June', day='second'); Hive accepts this query. However if you try to select from this table and insert into another expecting schema match, it will insert nulls instead. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1af68a6 ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 393ef57 ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 2ece97e ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java a704462 ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java fb79823 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g ca667d4 ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java 767f545 ql/src/test/queries/clientnegative/illegal_partition_type.q PRE-CREATION ql/src/test/queries/clientnegative/illegal_partition_type2.q PRE-CREATION ql/src/test/queries/clientpositive/partition_type_check.q PRE-CREATION ql/src/test/results/clientnegative/alter_table_add_partition.q.out bd9c148 ql/src/test/results/clientnegative/alter_view_failure5.q.out 4edb82c ql/src/test/results/clientnegative/illegal_partition_type.q.out PRE-CREATION ql/src/test/results/clientnegative/illegal_partition_type2.q.out PRE-CREATION ql/src/test/results/clientpositive/parititon_type_check.q.out PRE-CREATION ql/src/test/results/clientpositive/partition_type_check.q.out PRE-CREATION Diff: https://reviews.apache.org/r/14155/diff/ Testing 
--- Ran all tests. Thanks, Vikram Dixit Kumaraswamy
[jira] [Updated] (HIVE-5297) Hive does not honor type for partition columns
[ https://issues.apache.org/jira/browse/HIVE-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5297: - Status: Patch Available (was: Open) Hive does not honor type for partition columns -- Key: HIVE-5297 URL: https://issues.apache.org/jira/browse/HIVE-5297 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.11.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5297.1.patch, HIVE-5297.2.patch, HIVE-5297.3.patch Hive does not consider the type of the partition column while writing partitions. Consider for example the query: {noformat} create table tab1 (id1 int, id2 string) PARTITIONED BY(month string,day int) row format delimited fields terminated by ','; alter table tab1 add partition (month='June', day='second'); {noformat} Hive accepts this query. However if you try to select from this table and insert into another expecting schema match, it will insert nulls instead. We should throw an exception on such user error at the time the partition addition/load happens. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
HBase Tables Join in Hive
Hi, I have 2 tables in HBase. Table1: data, userID (this is a very big table). Table2: userID, userDetails (this is a smaller table). We need to join both tables on the userID column and perform some queries. Our idea is to map both Table1 and Table2 in Hive using HBaseStorageHandler. Does Hive also support JOINs on these HBase-mapped tables? Regards, Kiran
[jira] [Commented] (HIVE-4998) support jdbc documented table types in default configuration
[ https://issues.apache.org/jira/browse/HIVE-4998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769432#comment-13769432 ] Hudson commented on HIVE-4998: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #101 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/101/]) HIVE-4998 support jdbc documented table types in default configuration (Thejas Nair via Harish Butani) (rhbutani: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1523741) * /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java * /hive/trunk/conf/hive-default.xml.template * /hive/trunk/jdbc/src/test/org/apache/hive/jdbc/TestJdbcDriver2.java support jdbc documented table types in default configuration Key: HIVE-4998 URL: https://issues.apache.org/jira/browse/HIVE-4998 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.12.0 Attachments: HIVE-4998.1.patch The jdbc table types supported by hive server2 are not the documented typical types [1] in jdbc, they are hive specific types (MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW). HIVE-4573 added support for the jdbc documented typical types, but the HS2 default configuration is to return the hive types The default configuration should result in the expected jdbc typical behavior. [1] http://docs.oracle.com/javase/6/docs/api/java/sql/DatabaseMetaData.html?is-external=true#getTables(java.lang.String, java.lang.String, java.lang.String, java.lang.String[]) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4998) support jdbc documented table types in default configuration
[ https://issues.apache.org/jira/browse/HIVE-4998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769467#comment-13769467 ] Hudson commented on HIVE-4998: -- ABORTED: Integrated in Hive-trunk-hadoop2 #435 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/435/]) HIVE-4998 support jdbc documented table types in default configuration (Thejas Nair via Harish Butani) (rhbutani: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1523741) * /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java * /hive/trunk/conf/hive-default.xml.template * /hive/trunk/jdbc/src/test/org/apache/hive/jdbc/TestJdbcDriver2.java support jdbc documented table types in default configuration Key: HIVE-4998 URL: https://issues.apache.org/jira/browse/HIVE-4998 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.12.0 Attachments: HIVE-4998.1.patch The jdbc table types supported by hive server2 are not the documented typical types [1] in jdbc, they are hive specific types (MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW). HIVE-4573 added support for the jdbc documented typical types, but the HS2 default configuration is to return the hive types The default configuration should result in the expected jdbc typical behavior. [1] http://docs.oracle.com/javase/6/docs/api/java/sql/DatabaseMetaData.html?is-external=true#getTables(java.lang.String, java.lang.String, java.lang.String, java.lang.String[]) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4605) hive job fails when insert overwrite ORC table
[ https://issues.apache.org/jira/browse/HIVE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769472#comment-13769472 ] Joe Travaglini commented on HIVE-4605: -- Brock, I've also been seeing this same symptom several times over the past month, but with RCFile and not ORC. I also cannot reliably reproduce it, but it's happening. See also http://mail-archives.apache.org/mod_mbox/hive-user/201306.mbox/%3ccansfgrkr0jy5w3ey3z8awtwpyphgu5yedicybvjbnwr_o_5...@mail.gmail.com%3E which seems like the same symptom. Hive 0.10 in CDH4.3.1 hive job fails when insert overwrite ORC table -- Key: HIVE-4605 URL: https://issues.apache.org/jira/browse/HIVE-4605 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Environment: OS: 2.6.18-194.el5xen #1 SMP Fri Apr 2 15:34:40 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux Hadoop 1.1.2 Reporter: Link Qian Assignee: Brock Noland 1, create a table with ORC storage model: create table iparea_analysis_orc (network int, ip string, ) stored as ORC; 2, insert table iparea_analysis_orc select network, ip, , the script succeeds, but it fails after adding the *OVERWRITE* keyword. The main error log is listed here. 
java.lang.RuntimeException: Hive Runtime Error while closing operators: Unable to rename output from: hdfs://qa3hop001.uucun.com:9000/tmp/hive-hadoop/hive_2013-05-24_15-11-06_511_7746839019590922068/_task_tmp.-ext-1/_tmp.00_0 to: hdfs://qa3hop001.uucun.com:9000/tmp/hive-hadoop/hive_2013-05-24_15-11-06_511_7746839019590922068/_tmp.-ext-1/00_0 at org.apache.hadoop.hive.ql.exec.ExecReducer.close(ExecReducer.java:317) at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:530) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename output from: hdfs://qa3hop001.uucun.com:9000/tmp/hive-hadoop/hive_2013-05-24_15-11-06_511_7746839019590922068/_task_tmp.-ext-1/_tmp.00_0 to: hdfs://qa3hop001.uucun.com:9000/tmp/hive-hadoop/hive_2013-05-24_15-11-06_511_7746839019590922068/_tmp.-ext-1/00_0 at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commit(FileSinkOperator.java:197) at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.access$300(FileSinkOperator.java:108) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:867) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597) at org.apache.hadoop.hive.ql.exec.ExecReducer.close(ExecReducer.java:309) ... 7 more -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
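The "Unable to rename output" failure above comes from the commit step that renames a task's temp file to its final name. A minimal JDK-only sketch of that commit pattern, with illustrative file names (this is not Hive's actual FileSinkOperator code, which renames on HDFS, not a local filesystem):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CommitRenameDemo {
    /** Commits a task's temp output by renaming it; fails loudly when the rename cannot be done. */
    public static void commit(Path tmp, Path dest) throws IOException {
        try {
            Files.move(tmp, dest, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            // Mirrors Hive's "Unable to rename output from: ... to: ..." wrapping.
            throw new IOException("Unable to rename output from: " + tmp + " to: " + dest, e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("commit-demo");
        Path tmp = Files.createFile(dir.resolve("_tmp.000000_0"));
        commit(tmp, dir.resolve("000000_0"));
        System.out.println("committed: " + Files.exists(dir.resolve("000000_0")));
    }
}
```

On HDFS the rename can fail for reasons a local filesystem never shows (missing parent directory, stale lease, permissions), which is why the wrapped exception that preserves both paths is the useful part of this pattern.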
did you always have to log in to phabricator
I never remember having to log into phabricator to view a patch. Has this changed recently? I believe that having to create an external account to view a patch in progress is not something we should be doing.
[jira] [Commented] (HIVE-3764) Support metastore version consistency check
[ https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769546#comment-13769546 ] Brock Noland commented on HIVE-3764: FYI, it looks like you tried to delete HIVE-3764.4.patch but it's still there? Anyway, based on the date, it looks like HIVE-3764.1.patch is the current patch. Support metastore version consistency check --- Key: HIVE-3764 URL: https://issues.apache.org/jira/browse/HIVE-3764 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.8.0, 0.9.0, 0.10.0, 0.11.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.12.0 Attachments: HIVE-3764.1.patch, HIVE-3764.4.patch Today there's no version/compatibility information stored in the hive metastore. Also, the datanucleus configuration property to automatically create missing tables is enabled by default. If you happen to start an older or newer hive, or don't run the correct upgrade scripts during migration, the metastore would end up corrupted. The autoCreate schema is not always sufficient to upgrade the metastore when migrating to a newer release. It's not supported with all databases. Besides, the migration often involves altering existing tables, changing or moving data, etc. Hence it's very useful to have some consistency check to make sure that hive is using the correct metastore and that, for production systems, the schema is not automatically updated by running hive. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
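The core of the consistency check HIVE-3764 proposes can be sketched in a few lines. The method names and version strings below are illustrative, not Hive's actual schema-tool API: compare the schema version recorded in the metastore with the version this Hive build expects, and refuse to proceed on a mismatch instead of letting datanucleus auto-create tables.

```java
public class SchemaVersionCheck {
    /** Returns true when the recorded metastore schema version matches the expected one. */
    public static boolean isCompatible(String expected, String recorded) {
        return expected != null && expected.equals(recorded);
    }

    /** Fails fast on mismatch instead of silently running against a wrong schema. */
    public static void verify(String expected, String recorded) {
        if (!isCompatible(expected, recorded)) {
            throw new IllegalStateException("Metastore schema version " + recorded
                + " does not match expected version " + expected
                + "; run the upgrade scripts before starting Hive");
        }
    }

    public static void main(String[] args) {
        verify("0.12.0", "0.12.0");  // matching versions: no exception
        System.out.println(isCompatible("0.12.0", "0.11.0"));
    }
}
```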
[jira] [Updated] (HIVE-5292) Join on decimal columns fails to return rows
[ https://issues.apache.org/jira/browse/HIVE-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5292: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Navis! [~thejas] I will recommend inclusion of this bug fix in 0.12 as well. Join on decimal columns fails to return rows Key: HIVE-5292 URL: https://issues.apache.org/jira/browse/HIVE-5292 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.11.0 Environment: Linux lnxx64r5 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux Reporter: Sergio Lob Assignee: Navis Fix For: 0.13.0 Attachments: D12969.1.patch Join on matching decimal columns returns 0 rows To reproduce (I used beeline): 1. create 2 simple identical tables with 2 identical rows: CREATE TABLE SERGDEC(I INT, D DECIMAL) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'; CREATE TABLE SERGDEC2(I INT, D DECIMAL) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'; 2. populate tables with identical data: LOAD DATA LOCAL INPATH './decdata' OVERWRITE INTO TABLE SERGDEC ; LOAD DATA LOCAL INPATH './decdata' OVERWRITE INTO TABLE SERGDEC2 ; 3. data file decdata contains: 10|.98 20|1234567890.1234 4. Perform join (returns 0 rows instead of 2): SELECT T1.I, T1.D, T2.D FROM SERGDEC T1 JOIN SERGDEC2 T2 ON T1.D = T2.D ; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
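The thread does not state HIVE-5292's root cause, but a classic pitfall with decimal join keys is easy to demonstrate with the JDK's BigDecimal: values that are numerically equal can differ in scale, so equals()/hashCode() disagree with compareTo(), and hash-based key matching silently drops rows unless keys are normalized first.

```java
import java.math.BigDecimal;

public class DecimalKeyDemo {
    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("0.98");
        BigDecimal b = new BigDecimal("0.980");  // same value, different scale

        System.out.println(a.compareTo(b) == 0); // numerically equal
        System.out.println(a.equals(b));         // NOT equal: scale differs

        // A hash-join key should therefore be normalized, e.g.:
        System.out.println(a.stripTrailingZeros().equals(b.stripTrailingZeros()));
    }
}
```

Because equals() and hashCode() both incorporate the scale, two such values land in different hash buckets, which is exactly the shape of failure the reporter describes (0 rows instead of 2).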
Re: did you always have to log in to phabricator
Personally I prefer Review Board. On Tue, Sep 17, 2013 at 8:31 AM, Edward Capriolo edlinuxg...@gmail.com wrote: I never remember having to log into phabricator to view a patch. Has this changed recently? I believe that having to create an external account to view a patch in progress is not something we should be doing. -- Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
[jira] [Updated] (HIVE-5285) Custom SerDes throw cast exception when there are complex nested structures containing NonSettableObjectInspectors.
[ https://issues.apache.org/jira/browse/HIVE-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5285: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Hari! [~thejas] Will recommend inclusion of this and HIVE-5199 in 0.12 branch. Custom SerDes throw cast exception when there are complex nested structures containing NonSettableObjectInspectors. --- Key: HIVE-5285 URL: https://issues.apache.org/jira/browse/HIVE-5285 Project: Hive Issue Type: Bug Affects Versions: 0.11.1 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Priority: Critical Fix For: 0.13.0 Attachments: HIVE-5285.1.patch.txt, HIVE-5285.2.patch.txt The approach for HIVE-5199 fix is correct.However, the fix for HIVE-5199 is incomplete. Consider a complex nested structure containing the following object inspector hierarchy: SettableStructObjectInspector { ListObjectInspectorNonSettableStructObjectInspector } In the above case, the cast exception can happen via MapOperator/FetchOperator as below: java.io.IOException: java.lang.ClassCastException: com.skype.data.hadoop.hive.proto.CustomObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.SettableMapObjectInspector at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:545) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:489) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:136) at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1412) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:271) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614) at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:160) Caused by: java.lang.ClassCastException: com.skype.data.whaleshark.hadoop.hive.proto.ProtoMapObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.SettableMapObjectInspector at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:144) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.init(ObjectInspectorConverters.java:294) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:138) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$ListConverter.convert(ObjectInspectorConverters.java:251) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:316) at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:529) ... 13 more -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
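The defensive pattern behind the HIVE-5285 fix can be sketched without Hive's classes. The interfaces below are stand-ins, not the real ObjectInspector hierarchy: never cast to the settable variant blindly, because a custom SerDe may hand back a non-settable inspector anywhere inside a nested structure; check instanceof and fall back instead of throwing ClassCastException.

```java
public class InspectorCastDemo {
    public interface Inspector {}
    public interface SettableInspector extends Inspector { void set(Object v); }

    /** A custom SerDe's inspector: valid, but not settable. */
    public static class CustomInspector implements Inspector {}

    /** A standard inspector that supports in-place mutation. */
    public static class StandardInspector implements SettableInspector {
        private Object value;
        public void set(Object v) { value = v; }
    }

    /** Chooses a conversion strategy instead of casting unconditionally. */
    public static String chooseConversion(Inspector oi) {
        if (oi instanceof SettableInspector) {
            return "settable";  // convert in place
        }
        return "copy";          // fall back: convert into a fresh settable object
    }

    public static void main(String[] args) {
        System.out.println(chooseConversion(new StandardInspector()));
        System.out.println(chooseConversion(new CustomInspector()));
    }
}
```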
[jira] [Commented] (HIVE-5294) Create collect UDF and make evaluator reusable
[ https://issues.apache.org/jira/browse/HIVE-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769559#comment-13769559 ] Brock Noland commented on HIVE-5294: This looks good! I wander about the aggregation buffer constructor, specifically: Log.error(buffer type was null); Won't this lead to a NPE later? If so, should we just throw a RuntimeException? Create collect UDF and make evaluator reusable -- Key: HIVE-5294 URL: https://issues.apache.org/jira/browse/HIVE-5294 Project: Hive Issue Type: New Feature Reporter: Edward Capriolo Assignee: Edward Capriolo Attachments: HIVE-5294.patch.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5088) Fix udf_translate.q on Windows
[ https://issues.apache.org/jira/browse/HIVE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769562#comment-13769562 ] Ashutosh Chauhan commented on HIVE-5088: Just to mention, the fix for the underlying problem in Hadoop has already been committed via HADOOP-9801 and will be available in 1.3.0 and 2.1.1-beta, so we may not need to put a workaround in Hive for this. Fix udf_translate.q on Windows -- Key: HIVE-5088 URL: https://issues.apache.org/jira/browse/HIVE-5088 Project: Hive Issue Type: Bug Components: Tests, Windows Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-5088-1.patch Test failed with message: [junit] Begin query: udf_translate.q [junit] 13/08/14 03:23:57 FATAL conf.Configuration: error parsing conf file: com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence. [junit] Exception in thread main java.lang.RuntimeException: com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence. 
[junit] at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1255) [junit] at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1117) [junit] at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1053) [junit] at org.apache.hadoop.conf.Configuration.get(Configuration.java:397) [junit] at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:594) [junit] at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:1015) [junit] at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:659) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [junit] at java.lang.reflect.Method.invoke(Method.java:597) [junit] at org.apache.hadoop.util.RunJar.main(RunJar.java:160) [junit] Caused by: com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence. 
[junit] at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:684) [junit] at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:554) [junit] at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1742) [junit] at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.peekChar(XMLEntityScanner.java:487) [junit] at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2688) [junit] at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:647) [junit] at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140) [junit] at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511) [junit] at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808) [junit] at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737) [junit] at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119) [junit] at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:232) [junit] at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284) [junit] at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124) [junit] at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1181) [junit] ... 11 more [junit] Exception: Client Execution failed with error code = 1 [junit] See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs. [junit] junit.framework.AssertionFailedError: Client Execution failed with error code = 1 [junit] See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs. 
[junit] at junit.framework.Assert.fail(Assert.java:47) [junit] at org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:122) [junit] at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_translate(TestCliDriver.java:104) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit] at
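The MalformedByteSequenceException above is the XML parser strictly decoding the conf file as UTF-8, so a single byte that is legal in some local encoding aborts parsing. A small JDK sketch of the same check, using a REPORT-mode CharsetDecoder to surface the bad byte the way Xerces does:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8CheckDemo {
    /** Returns true when the bytes form well-formed UTF-8. */
    public static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder dec = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)       // fail instead of replacing
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            dec.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;  // the parser's "Invalid byte ... UTF-8 sequence" case
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidUtf8("plain ascii".getBytes(StandardCharsets.UTF_8)));
        System.out.println(isValidUtf8(new byte[] { (byte) 0xFF }));  // never valid in UTF-8
    }
}
```

Running a check like this over generated hive-site.xml files would pinpoint the offending byte before Hadoop's Configuration loader trips over it.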
[jira] [Commented] (HIVE-5294) Create collect UDF and make evaluator reusable
[ https://issues.apache.org/jira/browse/HIVE-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769586#comment-13769586 ] Edward Capriolo commented on HIVE-5294: --- Yes that should throw at runtime. That was something left over from testing. Create collect UDF and make evaluator reusable -- Key: HIVE-5294 URL: https://issues.apache.org/jira/browse/HIVE-5294 Project: Hive Issue Type: New Feature Reporter: Edward Capriolo Assignee: Edward Capriolo Attachments: HIVE-5294.patch.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
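The review point in this exchange, log-and-continue versus fail-fast, is worth a tiny sketch. Names are illustrative, not the patch's actual code: logging an unexpected null and carrying on defers the failure to a confusing NPE at some later use site, while throwing immediately names the real cause.

```java
public class FailFastDemo {
    /** Logs and continues: the caller still receives something unusable. */
    public static Object logOnly(Object bufferType) {
        if (bufferType == null) {
            System.err.println("buffer type was null");  // execution continues anyway
        }
        return bufferType;  // may be null; NPE happens later, far from the cause
    }

    /** Fails fast: the exception points directly at the broken invariant. */
    public static Object failFast(Object bufferType) {
        if (bufferType == null) {
            throw new RuntimeException("buffer type was null");
        }
        return bufferType;
    }

    public static void main(String[] args) {
        try {
            failFast(null);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());  // the failure names its cause
        }
    }
}
```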
[jira] [Updated] (HIVE-5246) Local task for map join submitted via oozie job fails on a secure HDFS
[ https://issues.apache.org/jira/browse/HIVE-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5246: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. Thank you for your contribution! Local task for map join submitted via oozie job fails on a secure HDFS --- Key: HIVE-5246 URL: https://issues.apache.org/jira/browse/HIVE-5246 Project: Hive Issue Type: Bug Affects Versions: 0.10.0, 0.11.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.13.0 Attachments: HIVE-5246.1.patch, HIVE-5246-test.tar For a Hive query started by Oozie Hive action, the local task submitted for Mapjoin fails. The HDFS delegation token is not shared properly with the child JVM created for the local task. Oozie creates a delegation token for the Hive action and sets env variable HADOOP_TOKEN_FILE_LOCATION as well as mapreduce.job.credentials.binary config property. However this doesn't get passed down to the child JVM which causes the problem. This is similar issue addressed by HIVE-4343 which address the problem HiveServer2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
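The idea behind the HIVE-5246 fix can be sketched with plain JDK code. Everything here is illustrative rather than Hive's actual MapredLocalTask launcher: when spawning the child JVM for the local task, explicitly propagate the delegation-token location (HADOOP_TOKEN_FILE_LOCATION) instead of assuming the child inherits it.

```java
import java.util.HashMap;
import java.util.Map;

public class ChildEnvDemo {
    /** Builds the child JVM's environment, copying the token location only if set. */
    public static Map<String, String> childEnv(Map<String, String> parentEnv) {
        Map<String, String> env = new HashMap<>();
        String tokenFile = parentEnv.get("HADOOP_TOKEN_FILE_LOCATION");
        if (tokenFile != null) {
            env.put("HADOOP_TOKEN_FILE_LOCATION", tokenFile);
        }
        return env;
    }

    public static void main(String[] args) {
        Map<String, String> parent = new HashMap<>();
        parent.put("HADOOP_TOKEN_FILE_LOCATION", "/tmp/container_tokens");  // made-up path
        // With ProcessBuilder, the returned map would be installed via pb.environment().
        System.out.println(childEnv(parent).get("HADOOP_TOKEN_FILE_LOCATION"));
    }
}
```

The same propagation is needed for the mapreduce.job.credentials.binary config property, which travels through the job configuration rather than the environment.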
[jira] [Commented] (HIVE-5279) Kryo cannot instantiate GenericUDAFEvaluator in GroupByDesc
[ https://issues.apache.org/jira/browse/HIVE-5279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769582#comment-13769582 ] Ashutosh Chauhan commented on HIVE-5279: With the latest patch, tests in TestCliDriver and TestContribCliDriver are no longer failing, so that's progress. However, 4 tests in TestParse are still failing, as previously, namely {{groupby1.q}}, {{groupby2.q}}, {{groupby3.q}} and {{groupby5.q}} Kryo cannot instantiate GenericUDAFEvaluator in GroupByDesc --- Key: HIVE-5279 URL: https://issues.apache.org/jira/browse/HIVE-5279 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Critical Attachments: 5279.patch, D12963.1.patch, D12963.2.patch, D12963.3.patch We didn't force GenericUDAFEvaluator to be Serializable. I don't know how the previous serialization mechanism handled this, but kryo complains that it's not Serializable and fails the query. The log below is an example: {noformat} java.lang.RuntimeException: com.esotericsoftware.kryo.KryoException: Class cannot be created (missing no-arg constructor): org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector Serialization trace: inputOI (org.apache.hadoop.hive.ql.udf.generic.GenericUDAFGroupOn$VersionedFloatGroupOnEval) genericUDAFEvaluator (org.apache.hadoop.hive.ql.plan.AggregationDesc) aggregators (org.apache.hadoop.hive.ql.plan.GroupByDesc) conf (org.apache.hadoop.hive.ql.exec.GroupByOperator) childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator) childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator) aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork) at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:312) at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:261) at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:256) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:383) at 
org.apache.h {noformat} If this cannot be fixed somehow, some UDAFs will have to be modified to run on hive-0.13.0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
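Kryo itself is not available here, so the constraint named in the error message, "Class cannot be created (missing no-arg constructor)", can be shown with plain reflection: instantiating a class through its declared zero-argument constructor fails for any class, like many ObjectInspector implementations, that only defines parameterized constructors.

```java
public class NoArgCtorDemo {
    public static class WithNoArg {
        public WithNoArg() {}
    }

    public static class WithoutNoArg {
        public WithoutNoArg(String required) {}  // no zero-arg constructor exists
    }

    /** Returns true when the class can be created reflectively with no arguments. */
    public static boolean instantiable(Class<?> c) {
        try {
            c.getDeclaredConstructor().newInstance();
            return true;
        } catch (ReflectiveOperationException e) {
            return false;  // same situation Kryo reports for StandardListObjectInspector
        }
    }

    public static void main(String[] args) {
        System.out.println(instantiable(WithNoArg.class));
        System.out.println(instantiable(WithoutNoArg.class));
    }
}
```

Kryo offers ways around this (custom serializers, instantiator strategies), which is presumably the design space the patch revisions above are exploring.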
[jira] [Commented] (HIVE-3764) Support metastore version consistency check
[ https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769634#comment-13769634 ] Brock Noland commented on HIVE-3764: It was my fault! I removed it. Support metastore version consistency check --- Key: HIVE-3764 URL: https://issues.apache.org/jira/browse/HIVE-3764 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.8.0, 0.9.0, 0.10.0, 0.11.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.12.0 Attachments: HIVE-3764.1.patch Today there's no version/compatibility information stored in the hive metastore. Also, the datanucleus configuration property to automatically create missing tables is enabled by default. If you happen to start an older or newer hive, or don't run the correct upgrade scripts during migration, the metastore would end up corrupted. The autoCreate schema is not always sufficient to upgrade the metastore when migrating to a newer release. It's not supported with all databases. Besides, the migration often involves altering existing tables, changing or moving data, etc. Hence it's very useful to have some consistency check to make sure that hive is using the correct metastore and that, for production systems, the schema is not automatically updated by running hive. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3764) Support metastore version consistency check
[ https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769630#comment-13769630 ] Prasad Mujumdar commented on HIVE-3764: --- [~brocknoland] yes, HIVE-3764.1.patch is the latest. The .4.patch left in there was the one that you added (to refresh the correct patch for a test run), hence I couldn't remove it. Sorry about the confusion. Support metastore version consistency check --- Key: HIVE-3764 URL: https://issues.apache.org/jira/browse/HIVE-3764 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.8.0, 0.9.0, 0.10.0, 0.11.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.12.0 Attachments: HIVE-3764.1.patch, HIVE-3764.4.patch Today there's no version/compatibility information stored in the hive metastore. Also, the datanucleus configuration property to automatically create missing tables is enabled by default. If you happen to start an older or newer hive, or don't run the correct upgrade scripts during migration, the metastore would end up corrupted. The autoCreate schema is not always sufficient to upgrade the metastore when migrating to a newer release. It's not supported with all databases. Besides, the migration often involves altering existing tables, changing or moving data, etc. Hence it's very useful to have some consistency check to make sure that hive is using the correct metastore and that, for production systems, the schema is not automatically updated by running hive. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3764) Support metastore version consistency check
[ https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-3764: --- Attachment: (was: HIVE-3764.4.patch) Support metastore version consistency check --- Key: HIVE-3764 URL: https://issues.apache.org/jira/browse/HIVE-3764 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.8.0, 0.9.0, 0.10.0, 0.11.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.12.0 Attachments: HIVE-3764.1.patch Today there's no version/compatibility information stored in the hive metastore. Also, the datanucleus configuration property to automatically create missing tables is enabled by default. If you happen to start an older or newer hive, or don't run the correct upgrade scripts during migration, the metastore would end up corrupted. The autoCreate schema is not always sufficient to upgrade the metastore when migrating to a newer release. It's not supported with all databases. Besides, the migration often involves altering existing tables, changing or moving data, etc. Hence it's very useful to have some consistency check to make sure that hive is using the correct metastore and that, for production systems, the schema is not automatically updated by running hive. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5294) Create collect UDF and make evaluator reusable
[ https://issues.apache.org/jira/browse/HIVE-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo updated HIVE-5294: -- Attachment: HIVE-5294.1.patch.txt Create collect UDF and make evaluator reusable -- Key: HIVE-5294 URL: https://issues.apache.org/jira/browse/HIVE-5294 Project: Hive Issue Type: New Feature Reporter: Edward Capriolo Assignee: Edward Capriolo Attachments: HIVE-5294.1.patch.txt, HIVE-5294.patch.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5294) Create collect UDF and make evaluator reusable
[ https://issues.apache.org/jira/browse/HIVE-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769621#comment-13769621 ] Edward Capriolo commented on HIVE-5294: --- The .1 patch throws a RuntimeException (which we should never hit anyway). Create collect UDF and make evaluator reusable -- Key: HIVE-5294 URL: https://issues.apache.org/jira/browse/HIVE-5294 Project: Hive Issue Type: New Feature Reporter: Edward Capriolo Assignee: Edward Capriolo Attachments: HIVE-5294.1.patch.txt, HIVE-5294.patch.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4998) support jdbc documented table types in default configuration
[ https://issues.apache.org/jira/browse/HIVE-4998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769639#comment-13769639 ] Hudson commented on HIVE-4998: -- SUCCESS: Integrated in Hive-trunk-h0.21 #2337 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2337/]) HIVE-4998 support jdbc documented table types in default configuration (Thejas Nair via Harish Butani) (rhbutani: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1523741) * /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java * /hive/trunk/conf/hive-default.xml.template * /hive/trunk/jdbc/src/test/org/apache/hive/jdbc/TestJdbcDriver2.java support jdbc documented table types in default configuration Key: HIVE-4998 URL: https://issues.apache.org/jira/browse/HIVE-4998 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.12.0 Attachments: HIVE-4998.1.patch The jdbc table types supported by hive server2 are not the documented typical types [1] in jdbc, they are hive specific types (MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW). HIVE-4573 added support for the jdbc documented typical types, but the HS2 default configuration is to return the hive types The default configuration should result in the expected jdbc typical behavior. [1] http://docs.oracle.com/javase/6/docs/api/java/sql/DatabaseMetaData.html?is-external=true#getTables(java.lang.String, java.lang.String, java.lang.String, java.lang.String[]) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: did you always have to log in to phabricator
Yeah. I used to be able to view w/o login, but now I am not. On Tue, Sep 17, 2013 at 7:27 AM, Brock Noland br...@cloudera.com wrote: Personally I prefer Review Board. On Tue, Sep 17, 2013 at 8:31 AM, Edward Capriolo edlinuxg...@gmail.com wrote: I never remember having to log into phabricator to view a patch. Has this changed recently? I believe that having to create an external account to view a patch in progress is not something we should be doing. -- Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
[jira] [Commented] (HIVE-5294) Create collect UDF and make evaluator reusable
[ https://issues.apache.org/jira/browse/HIVE-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769654#comment-13769654 ] Brock Noland commented on HIVE-5294: Agreed. This looks good to me. I plan on committing it if tests pass. Create collect UDF and make evaluator reusable -- Key: HIVE-5294 URL: https://issues.apache.org/jira/browse/HIVE-5294 Project: Hive Issue Type: New Feature Reporter: Edward Capriolo Assignee: Edward Capriolo Attachments: HIVE-5294.1.patch.txt, HIVE-5294.patch.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4113) Optimize select count(1) with RCFile and Orc
[ https://issues.apache.org/jira/browse/HIVE-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769688#comment-13769688 ] Ashutosh Chauhan commented on HIVE-4113: In addition to what [~yhuai] suggested for RCFile, a similar enhancement exists for ORC as well, as ORC stores stats (including counts) per stripe, which would allow us to do almost no IO. But I will say that those enhancements will likely require changes in query-processing code, so I consider them out of scope for this jira. Let's get this one in and take up the enhancements in a follow-up. Optimize select count(1) with RCFile and Orc Key: HIVE-4113 URL: https://issues.apache.org/jira/browse/HIVE-4113 Project: Hive Issue Type: Bug Components: File Formats Reporter: Gopal V Assignee: Brock Noland Fix For: 0.12.0 Attachments: HIVE-4113-0.patch, HIVE-4113.patch, HIVE-4113.patch select count(1) loads up every column of every row when used with RCFile. select count(1) from store_sales_10_rc gives {code} Job 0: Map: 5 Reduce: 1 Cumulative CPU: 31.73 sec HDFS Read: 234914410 HDFS Write: 8 SUCCESS {code} Whereas select count(ss_sold_date_sk) from store_sales_10_rc; reads far less {code} Job 0: Map: 5 Reduce: 1 Cumulative CPU: 29.75 sec HDFS Read: 28145994 HDFS Write: 8 SUCCESS {code} Which is 11% of the data size read by the COUNT(1). This was tracked down to the following code in RCFile.java {code} } else { // TODO: if no column name is specified e.g, in select count(1) from tt; // skip all columns, this should be distinguished from the case: // select * from tt; for (int i = 0; i < skippedColIDs.length; i++) { skippedColIDs[i] = false; } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
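The TODO in the RCFile snippet above marks the gap: when the query references no columns at all (count(1)), the reader could skip every column, but the current default is "skip nothing", which is why count(1) reads all 234 MB. A simplified sketch of the intended skip logic (not the real RCFile reader; the null/empty convention here is illustrative):

```java
import java.util.Arrays;

public class ColumnSkipDemo {
    /**
     * Computes which columns to skip.
     * readColIDs == null  means "select *": read everything.
     * readColIDs empty    means "no columns needed" (count(1)): read nothing.
     */
    public static boolean[] skippedColIDs(int numCols, int[] readColIDs) {
        boolean[] skipped = new boolean[numCols];
        if (readColIDs == null) {
            Arrays.fill(skipped, false);      // select *
        } else if (readColIDs.length == 0) {
            Arrays.fill(skipped, true);       // count(1): skip all columns
        } else {
            Arrays.fill(skipped, true);       // skip by default...
            for (int id : readColIDs) {
                skipped[id] = false;          // ...and read only what is referenced
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(skippedColIDs(3, new int[0])));
        System.out.println(Arrays.toString(skippedColIDs(3, new int[] { 1 })));
    }
}
```

The key design point is distinguishing "no columns listed because all are wanted" from "no columns listed because none are wanted", exactly the two cases the TODO comment says are currently conflated.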
[jira] [Updated] (HIVE-4173) Hive Ignoring where clause for multitable insert
[ https://issues.apache.org/jira/browse/HIVE-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-4173: -- Summary: Hive Ignoring where clause for multitable insert (was: Hive Ingnoring where clause for multitable insert) Hive Ignoring where clause for multitable insert Key: HIVE-4173 URL: https://issues.apache.org/jira/browse/HIVE-4173 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.8.1, 0.9.0 Environment: Red Hat Enterprise Linux Server release 6.3 (Santiago), Reporter: hussain Priority: Critical Hive ignores filter conditions given in a multi-insert select statement when a filter is also given on the source query. To highlight this issue, see the example below: the where clause (status!='C') on the employee12 table causes the issue, due to which the per-insert filters (batch_id='12' and batch_id!='12') do not work, dumping all the data coming from the source into both tables. I checked the hive execution plan and did not find Filter predicates for filtering records per insert statement. from (from employee12 select * where status!='C') t insert into table employee1 select status, field1, 'T' as field2, 'P' as field3, 'C' as field4 where batch_id='12' insert into table employee2 select status, field1, 'D' as field2, 'P' as field3, 'C' as field4 where batch_id!='12'; It works fine with a single insert; Hive generates the plan properly. I am able to reproduce this issue with the 8.1 and 9.0 versions of Hive.
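For clarity, the semantics the reporter expects can be modeled with a small Python sketch (a toy model, not Hive's planner; `multi_insert` is a hypothetical name): rows first pass the source query's filter, then each insert branch applies its own predicate independently.

```python
# Toy model of multi-table insert semantics: source filter first,
# then one independent predicate per insert branch.

def multi_insert(rows, source_pred, branch_preds):
    filtered = [r for r in rows if source_pred(r)]
    return [[r for r in filtered if pred(r)] for pred in branch_preds]

rows = [
    {"status": "C", "batch_id": "12"},   # dropped by the source filter
    {"status": "A", "batch_id": "12"},   # should land only in employee1
    {"status": "A", "batch_id": "13"},   # should land only in employee2
]
t1, t2 = multi_insert(
    rows,
    lambda r: r["status"] != "C",
    [lambda r: r["batch_id"] == "12", lambda r: r["batch_id"] != "12"],
)
assert t1 == [{"status": "A", "batch_id": "12"}]
assert t2 == [{"status": "A", "batch_id": "13"}]
```

The reported bug corresponds to the branch predicates being silently dropped, so every surviving source row lands in both tables.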
[jira] [Commented] (HIVE-4113) Optimize select count(1) with RCFile and Orc
[ https://issues.apache.org/jira/browse/HIVE-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769693#comment-13769693 ] Brock Noland commented on HIVE-4113: Agreed. Unfortunately I won't have time to take this up in the next few days, so if someone has time and would like to see this in soon I'd be more than willing to hand it off. Optimize select count(1) with RCFile and Orc Key: HIVE-4113 URL: https://issues.apache.org/jira/browse/HIVE-4113
[jira] [Updated] (HIVE-5298) AvroSerde performance problem caused by HIVE-3833
[ https://issues.apache.org/jira/browse/HIVE-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-5298: -- Status: Patch Available (was: Open) AvroSerde performance problem caused by HIVE-3833 - Key: HIVE-5298 URL: https://issues.apache.org/jira/browse/HIVE-5298 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.13.0 Attachments: HIVE-5298.1.patch, HIVE-5298.patch HIVE-3833 fixed the targeted problem and made Hive use partition-level metadata to initialize the object inspector. In doing that, however, it goes through every file under the table to access the partition metadata, which is very inefficient, especially in the case of multiple files per partition. This causes more problems for AvroSerde, because AvroSerde initialization accesses the schema, which is located on the file system. As a result, before Hive can process any data, it needs to access every file of a table, which can take long enough to cause job failure for lack of job progress. The improvement to be made is that partition metadata is accessed only once per partition.
[jira] [Commented] (HIVE-5294) Create collect UDF and make evaluator reusable
[ https://issues.apache.org/jira/browse/HIVE-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769709#comment-13769709 ] Hive QA commented on HIVE-5294: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12603607/HIVE-5294.1.patch.txt {color:green}SUCCESS:{color} +1 3126 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/783/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/783/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. Create collect UDF and make evaluator reusable Key: HIVE-5294 URL: https://issues.apache.org/jira/browse/HIVE-5294
[jira] [Updated] (HIVE-5298) AvroSerde performance problem caused by HIVE-3833
[ https://issues.apache.org/jira/browse/HIVE-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-5298: -- Attachment: HIVE-5298.1.patch Updated the patch based on the test result. Note that no test is added for this due to the nature of the issue. However, I will do manual testing and will update with the result. AvroSerde performance problem caused by HIVE-3833 - Key: HIVE-5298 URL: https://issues.apache.org/jira/browse/HIVE-5298
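The improvement described in HIVE-5298 can be sketched as follows (an assumed design in Python, not the actual patch; `MetadataStore` and `init_inspectors` are hypothetical names): fetch partition metadata once per partition and reuse it for every file, instead of refetching it per file.

```python
# Sketch: cache partition metadata so it is fetched once per partition,
# not once per file (the expensive call models reading an Avro schema).

class MetadataStore:
    def __init__(self):
        self.accesses = 0
    def partition_metadata(self, partition):
        self.accesses += 1          # e.g. reads the schema file from HDFS
        return {"partition": partition, "schema": "..."}

def init_inspectors(files_with_partition, store):
    cache = {}
    for path, partition in files_with_partition:
        if partition not in cache:
            cache[partition] = store.partition_metadata(partition)
        meta = cache[partition]     # per-file init reuses the cached metadata

files = [("a", "p1"), ("b", "p1"), ("c", "p1"), ("d", "p2"), ("e", "p2")]
store = MetadataStore()
init_inspectors(files, store)
assert store.accesses == 2          # 2 partitions, not 5 files
```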
Re: did you always have to log in to phabricator
I do not like this. It is inconvenient when using a mobile device, but more importantly it does not seem very transparent to our end users. For example, a user browsing jira may want to review the code only on review board (not yet attached to the issue); they should not be forced to sign up to help in the process. Would anyone from facebook care to chime in here? I think we all like Phabricator for the most part. Our docs suggest that Phabricator is our de-facto review system. As an ASF project I do not think requiring a login on some external service even to review a jira is correct. On Tue, Sep 17, 2013 at 12:27 PM, Xuefu Zhang xzh...@cloudera.com wrote: Yeah. I used to be able to view w/o login, but now I am not. On Tue, Sep 17, 2013 at 7:27 AM, Brock Noland br...@cloudera.com wrote: Personally I prefer Review Board. On Tue, Sep 17, 2013 at 8:31 AM, Edward Capriolo edlinuxg...@gmail.com wrote: I never remember having to log into phabricator to view a patch. Has this changed recently? I believe that having to create an external account to view a patch in progress is not something we should be doing. -- Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
[jira] [Commented] (HIVE-5292) Join on decimal columns fails to return rows
[ https://issues.apache.org/jira/browse/HIVE-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769748#comment-13769748 ] Hudson commented on HIVE-5292: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #102 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/102/]) HIVE-5292 : Join on decimal columns fails to return rows (Navis via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524062) * /hive/trunk/common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java * /hive/trunk/ql/src/test/queries/clientpositive/decimal_join.q * /hive/trunk/ql/src/test/results/clientpositive/decimal_join.q.out Join on decimal columns fails to return rows Key: HIVE-5292 URL: https://issues.apache.org/jira/browse/HIVE-5292 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.11.0 Environment: Linux lnxx64r5 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux Reporter: Sergio Lob Assignee: Navis Fix For: 0.13.0 Attachments: D12969.1.patch A join on matching decimal columns returns 0 rows. To reproduce (I used beeline): 1. create 2 simple identical tables with 2 identical rows: CREATE TABLE SERGDEC(I INT, D DECIMAL) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'; CREATE TABLE SERGDEC2(I INT, D DECIMAL) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'; 2. populate tables with identical data: LOAD DATA LOCAL INPATH './decdata' OVERWRITE INTO TABLE SERGDEC ; LOAD DATA LOCAL INPATH './decdata' OVERWRITE INTO TABLE SERGDEC2 ; 3. data file decdata contains: 10|.98 20|1234567890.1234 4. Perform the join (returns 0 rows instead of 2): SELECT T1.I, T1.D, T2.D FROM SERGDEC T1 JOIN SERGDEC2 T2 ON T1.D = T2.D ;
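The commit above touches HiveDecimal.java; one plausible way to illustrate the class of bug (an illustration only, not the actual HiveDecimal fix; `join_key` is a hypothetical name) is that join keys built from decimals must compare and hash consistently regardless of the textual form in the data file, e.g. ".98" vs "0.98" or trailing zeros.

```python
from decimal import Decimal

# Illustration: normalize decimal join keys so equal values produce
# identical keys whatever their textual representation.

def join_key(s):
    return Decimal(s).normalize()   # canonical exponent, trailing zeros gone

left  = {".98": 10, "1234567890.1234": 20}
right = {"0.98": 10, "1234567890.12340": 20}

matches = [
    (lv, rv)
    for ls, lv in left.items()
    for rs, rv in right.items()
    if join_key(ls) == join_key(rs)
]
assert len(matches) == 2            # both rows join, as the query expects
```

Without normalization, a hash join keyed on the raw representations would find no matches, which is exactly the "0 rows instead of 2" symptom reported.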
[jira] [Commented] (HIVE-5246) Local task for map join submitted via oozie job fails on a secure HDFS
[ https://issues.apache.org/jira/browse/HIVE-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769747#comment-13769747 ] Hudson commented on HIVE-5246: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #102 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/102/]) HIVE-5246 - Local task for map join submitted via oozie job fails on a secure HDFS (Prasad Mujumdar via Brock Noland) (brock: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524074) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SecureCmdDoAs.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java Local task for map join submitted via oozie job fails on a secure HDFS --- Key: HIVE-5246 URL: https://issues.apache.org/jira/browse/HIVE-5246 Project: Hive Issue Type: Bug Affects Versions: 0.10.0, 0.11.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.13.0 Attachments: HIVE-5246.1.patch, HIVE-5246-test.tar For a Hive query started by an Oozie Hive action, the local task submitted for a map join fails: the HDFS delegation token is not shared properly with the child JVM created for the local task. Oozie creates a delegation token for the Hive action and sets the env variable HADOOP_TOKEN_FILE_LOCATION as well as the mapreduce.job.credentials.binary config property. However, this doesn't get passed down to the child JVM, which causes the problem. This is similar to the issue addressed by HIVE-4343, which addresses the problem for HiveServer2.
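The general idea of the fix can be sketched in Python (a hedged illustration, not Hive's actual Java code; `spawn_child` is a hypothetical name): a child process only sees the delegation token if the parent explicitly forwards HADOOP_TOKEN_FILE_LOCATION into the child's environment.

```python
import os
import subprocess
import sys

# Sketch: forward the token-file location into the child's environment,
# then have the child echo it back so we can verify it arrived.

def spawn_child(cmd, token_path):
    env = dict(os.environ)
    if token_path:
        env["HADOOP_TOKEN_FILE_LOCATION"] = token_path
    return subprocess.run(cmd, env=env, capture_output=True, text=True)

child = [sys.executable, "-c",
         "import os; print(os.environ.get('HADOOP_TOKEN_FILE_LOCATION', ''))"]
result = spawn_child(child, "/tmp/token-cache")
assert result.stdout.strip() == "/tmp/token-cache"
```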
[jira] [Commented] (HIVE-5297) Hive does not honor type for partition columns
[ https://issues.apache.org/jira/browse/HIVE-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769729#comment-13769729 ] Sergey Shelukhin commented on HIVE-5297: There are open comments remaining... one should be a straightforward code change Hive does not honor type for partition columns -- Key: HIVE-5297 URL: https://issues.apache.org/jira/browse/HIVE-5297 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.11.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5297.1.patch, HIVE-5297.2.patch, HIVE-5297.3.patch Hive does not consider the type of the partition column while writing partitions. Consider for example the query: {noformat} create table tab1 (id1 int, id2 string) PARTITIONED BY(month string,day int) row format delimited fields terminated by ','; alter table tab1 add partition (month='June', day='second'); {noformat} Hive accepts this query. However, if you try to select from this table and insert into another expecting a schema match, it will insert nulls instead. We should throw an exception on such user error at the time the partition addition/load happens.
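The validation the JIRA asks for can be sketched in Python (an assumed API for illustration, not Hive's code; `validate_partition_spec` and `PARSERS` are hypothetical names): reject a partition spec whose value cannot be parsed as the declared column type, instead of silently writing NULLs later.

```python
# Sketch: validate partition values against declared column types
# at ADD PARTITION time.

PARSERS = {"int": int, "string": str}

def validate_partition_spec(schema, spec):
    for col, value in spec.items():
        col_type = schema[col]
        try:
            PARSERS[col_type](value)
        except ValueError:
            raise ValueError(
                f"partition column '{col}' is {col_type}, got {value!r}")

schema = {"month": "string", "day": "int"}
validate_partition_spec(schema, {"month": "June", "day": "2"})    # accepted
try:
    validate_partition_spec(schema, {"month": "June", "day": "second"})
    raised = False
except ValueError:
    raised = True
assert raised    # day='second' is rejected up front, not turned into NULLs
```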
[jira] [Updated] (HIVE-5271) Convert join op to a map join op in the planning phase
[ https://issues.apache.org/jira/browse/HIVE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5271: - Attachment: HIVE-5271.WIP.patch WIP patch. Convert join op to a map join op in the planning phase -- Key: HIVE-5271 URL: https://issues.apache.org/jira/browse/HIVE-5271 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: tez-branch Attachments: HIVE-5271.WIP.patch This captures the planning changes required in hive to support hash joins. We need to convert the join operator to a map join operator. This is hooked into the infrastructure provided by HIVE-5095.
[jira] [Commented] (HIVE-4113) Optimize select count(1) with RCFile and Orc
[ https://issues.apache.org/jira/browse/HIVE-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769730#comment-13769730 ] Yin Huai commented on HIVE-4113: Let me take a look. It seems only a few minor changes are needed for Brock's patch. One thing I need to make sure of is whether we populate all columns in the list of needed columns for select *. If so, we will not need hive.io.file.read.all.columns. Optimize select count(1) with RCFile and Orc Key: HIVE-4113 URL: https://issues.apache.org/jira/browse/HIVE-4113
[jira] [Commented] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769763#comment-13769763 ] Eugene Koifman commented on HIVE-4531: -- Could you open a review board so we can embed comments next to the code they refer to? [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog, WebHCat Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch, HIVE-4531-5.patch, HIVE-4531-6.patch, HIVE-4531-7.patch, HIVE-4531-8.patch, HIVE-4531-9.patch, samplestatusdirwithlist.tar.gz It would be nice if we collected task logs after the job finishes. This is similar to what Amazon EMR does.
[jira] [Commented] (HIVE-4113) Optimize select count(1) with RCFile and Orc
[ https://issues.apache.org/jira/browse/HIVE-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769740#comment-13769740 ] Ashutosh Chauhan commented on HIVE-4113: Thanks [~yhuai] for volunteering. Assigning it to you. Optimize select count(1) with RCFile and Orc Key: HIVE-4113 URL: https://issues.apache.org/jira/browse/HIVE-4113
[jira] [Updated] (HIVE-4113) Optimize select count(1) with RCFile and Orc
[ https://issues.apache.org/jira/browse/HIVE-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4113: --- Assignee: Yin Huai (was: Brock Noland) Optimize select count(1) with RCFile and Orc Key: HIVE-4113 URL: https://issues.apache.org/jira/browse/HIVE-4113
[jira] [Commented] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769742#comment-13769742 ] Eugene Koifman commented on HIVE-4531: -- I realized I missed LogRetriever in the review: 1. It opens a URLConnection in several places but doesn't close them. 2. Is this class meant to be used anywhere other than TempletonControllerJob? If not, can it be moved to the same package and made package private (to reduce the public API footprint)? Similarly, could all member variables/methods be made as private as possible? 3. I think it would be really useful to add some higher-level documentation about the design: why does this class exist, why does it parse JSPs, where does it write the result, etc. I think 1 or 2 paragraphs would be sufficient. [WebHCat] Collecting task logs to hdfs Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531
[jira] [Commented] (HIVE-5292) Join on decimal columns fails to return rows
[ https://issues.apache.org/jira/browse/HIVE-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769776#comment-13769776 ] Hudson commented on HIVE-5292: -- FAILURE: Integrated in Hive-trunk-hadoop1-ptest #169 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/169/]) HIVE-5292 : Join on decimal columns fails to return rows (Navis via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524062) * /hive/trunk/common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java * /hive/trunk/ql/src/test/queries/clientpositive/decimal_join.q * /hive/trunk/ql/src/test/results/clientpositive/decimal_join.q.out Join on decimal columns fails to return rows Key: HIVE-5292 URL: https://issues.apache.org/jira/browse/HIVE-5292
[jira] [Commented] (HIVE-5246) Local task for map join submitted via oozie job fails on a secure HDFS
[ https://issues.apache.org/jira/browse/HIVE-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769775#comment-13769775 ] Hudson commented on HIVE-5246: -- FAILURE: Integrated in Hive-trunk-hadoop1-ptest #169 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/169/]) HIVE-5246 - Local task for map join submitted via oozie job fails on a secure HDFS (Prasad Mujumdar via Brock Noland) (brock: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524074) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SecureCmdDoAs.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java Local task for map join submitted via oozie job fails on a secure HDFS Key: HIVE-5246 URL: https://issues.apache.org/jira/browse/HIVE-5246
[jira] [Commented] (HIVE-5285) Custom SerDes throw cast exception when there are complex nested structures containing NonSettableObjectInspectors.
[ https://issues.apache.org/jira/browse/HIVE-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769774#comment-13769774 ] Hudson commented on HIVE-5285: -- FAILURE: Integrated in Hive-trunk-hadoop1-ptest #169 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/169/]) HIVE-5285 : Custom SerDes throw cast exception when there are complex nested structures containing NonSettableObjectInspectors. (Hari Sankar via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524067) * /hive/trunk/ql/src/test/org/apache/hadoop/hive/serde2/CustomSerDe3.java * /hive/trunk/ql/src/test/queries/clientpositive/partition_wise_fileformat17.q * /hive/trunk/ql/src/test/results/clientpositive/partition_wise_fileformat17.q.out * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java Custom SerDes throw cast exception when there are complex nested structures containing NonSettableObjectInspectors. --- Key: HIVE-5285 URL: https://issues.apache.org/jira/browse/HIVE-5285 Project: Hive Issue Type: Bug Affects Versions: 0.11.1 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Priority: Critical Fix For: 0.13.0 Attachments: HIVE-5285.1.patch.txt, HIVE-5285.2.patch.txt The approach of the HIVE-5199 fix is correct. However, the fix for HIVE-5199 is incomplete. 
Consider a complex nested structure containing the following object inspector hierarchy: SettableStructObjectInspector { ListObjectInspectorNonSettableStructObjectInspector } In the above case, the cast exception can happen via MapOperator/FetchOperator as below: java.io.IOException: java.lang.ClassCastException: com.skype.data.hadoop.hive.proto.CustomObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.SettableMapObjectInspector at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:545) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:489) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:136) at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1412) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:271) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:160) Caused by: java.lang.ClassCastException: com.skype.data.whaleshark.hadoop.hive.proto.ProtoMapObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.SettableMapObjectInspector at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:144) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.init(ObjectInspectorConverters.java:294) at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:138) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$ListConverter.convert(ObjectInspectorConverters.java:251) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:316) at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:529) ... 13 more
Re: Interesting claims that seem untrue
Carter, what you are doing essentially contradicts the ASF policy of community over code. Perhaps your intentions are good. However, LOC calculations or other silly contests essentially drive a wedge between developers who happen to draw their paycheck from different commercial entities. The Hadoop community passed through this already, and it caused nothing but despair and bitterness between the people. Unlike some other popular contests, the number of lines contributed doesn't matter for most. Seriously. Regards, Cos On Mon, Sep 16, 2013 at 01:58PM, Carter Shanklin wrote: Ed, If nothing else I'm glad it was interesting enough to generate some discussion. These sorts of stats are always subjects of a lot of controversy. I have seen a lot of these sorts of charts float around in confidential slide decks and I think it's good to have them out in the open where anyone can critique and correct them. In this case Ed, you've pointed out a legitimate flaw in my analysis. Doing the analysis again I found that previously, due to a bug in my scripts, JIRAs that didn't have Hudson comments in them were not counted (this was one way of identifying SVN commit IDs, which I have since removed due to flakiness). Brock's patch was the single largest victim of this bug but not the only one; there were some from Cloudera, NexR, Hortonworks, Facebook, even 2 from you, Ed. The interested can see a full list of exclusions here: https://docs.google.com/spreadsheet/ccc?key=0ArmXd5zzNQm5dDJTMkFtaUk2d0dyU3hnWGJCcUczbXc#gid=0. I apologize to those under-represented; there wasn't any intent on my part to minimize anyone's work. The impact on the final totals is Cloudera +5.4%, NexR +0.8%, Facebook -2.7%, Hortonworks -3.3%. I will be updating the blog later today with the relevant corrections. There is going to be continued interest in seeing charts like these, for example when Hive 12 is officially done. Sanjay suggested that LoC counts may not be the best way to represent true contribution. 
I agree that not all lines of code are created equal, for example a few monster patches recently went in re-arranging HCatalog namespaces and I think also indentation style. This (hopefully) mechanical work is not on the same footing as adding new query language features. Still it is work and wouldn't be fair to pretend it didn't happen. If anyone has ideas on better ways to fairly capture contribution I'm open to suggestions. On Thu, Sep 12, 2013 at 7:19 AM, Edward Capriolo edlinuxg...@gmail.comwrote: I was reading the horton-works blog and found an interesting article. http://hortonworks.com/blog/stinger-phase-2-the-journey-to-100x-faster-hive/#comment-160753 There is a very interesting graphic which attempts to demonstrate lines of code in the 12 release. http://hortonworks.com/wp-content/uploads/2013/09/hive4.png Although I do not know how they are calculated, they are probably counting code generated by tests output, but besides that they are wrong. One claim is that Cloudera contributed 4,244 lines of code. So to debunk that claim: In https://issues.apache.org/jira/browse/HIVE-4675 Brock Noland from cloudera, created the ptest2 testing framework. He did all the work for ptest2 in hive 12, and it is clearly more then 4,244 This consists of 84 java files [edward@desksandra ptest2]$ find . -name *.java | wc -l 84 and by itself is 8001 lines of code. [edward@desksandra ptest2]$ find . -name *.java | xargs cat | wc -l 8001 [edward@desksandra hive-trunk]$ wc -l HIVE-4675.patch 7902 HIVE-4675.patch This is not the only feature from cloudera in hive 12. There is also a section of the article that talks of a ROAD MAP for hive features. I did not know we (hive) had a road map. I have advocated switching to feature based release and having a road map before, but it was suggested that might limit people from itch-scratching. 
-- Carter Shanklin Director, Product Management Hortonworks (M): +1.650.644.8795 (T): @cshanklin http://twitter.com/cshanklin -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
[jira] [Commented] (HIVE-5138) Streaming - Web HCat API
[ https://issues.apache.org/jira/browse/HIVE-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769784#comment-13769784 ] Eugene Koifman commented on HIVE-5138: -- OK, makes sense. It would be useful to add some javadoc about concurrency (or rather why it's not an issue) Streaming - Web HCat API - Key: HIVE-5138 URL: https://issues.apache.org/jira/browse/HIVE-5138 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore, WebHCat Reporter: Roshan Naik Assignee: Roshan Naik Attachments: HIVE-4196.v2.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5285) Custom SerDes throw cast exception when there are complex nested structures containing NonSettableObjectInspectors.
[ https://issues.apache.org/jira/browse/HIVE-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769746#comment-13769746 ] Hudson commented on HIVE-5285: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #102 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/102/]) HIVE-5285 : Custom SerDes throw cast exception when there are complex nested structures containing NonSettableObjectInspectors. (Hari Sankar via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524067) * /hive/trunk/ql/src/test/org/apache/hadoop/hive/serde2/CustomSerDe3.java * /hive/trunk/ql/src/test/queries/clientpositive/partition_wise_fileformat17.q * /hive/trunk/ql/src/test/results/clientpositive/partition_wise_fileformat17.q.out * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java Custom SerDes throw cast exception when there are complex nested structures containing NonSettableObjectInspectors. --- Key: HIVE-5285 URL: https://issues.apache.org/jira/browse/HIVE-5285 Project: Hive Issue Type: Bug Affects Versions: 0.11.1 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Priority: Critical Fix For: 0.13.0 Attachments: HIVE-5285.1.patch.txt, HIVE-5285.2.patch.txt The approach of the HIVE-5199 fix is correct. However, the fix for HIVE-5199 is incomplete.
Consider a complex nested structure containing the following object inspector hierarchy: SettableStructObjectInspector { ListObjectInspectorNonSettableStructObjectInspector } In the above case, the cast exception can happen via MapOperator/FetchOperator as below: java.io.IOException: java.lang.ClassCastException: com.skype.data.hadoop.hive.proto.CustomObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.SettableMapObjectInspector at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:545) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:489) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:136) at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1412) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:271) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:160) Caused by: java.lang.ClassCastException: com.skype.data.whaleshark.hadoop.hive.proto.ProtoMapObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.SettableMapObjectInspector at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:144) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.init(ObjectInspectorConverters.java:294) at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:138) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$ListConverter.convert(ObjectInspectorConverters.java:251) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:316) at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:529) ... 13 more -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
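The pattern behind a fix for a failure like the one above can be sketched with toy types (hypothetical stand-ins, not Hive's actual ObjectInspector hierarchy): test for the settable interface before casting, and fall back to copying into a standard settable inspector when the custom inspector is read-only.

```java
// Toy illustration of guarding the cast that fails in the stack trace above.
// These interfaces are stand-ins, not Hive's real classes.
interface MapObjectInspector {}
interface SettableMapObjectInspector extends MapObjectInspector {}

class CustomProtoMapInspector implements MapObjectInspector {}        // non-settable, like a custom SerDe's
class StandardMapInspector implements SettableMapObjectInspector {}   // settable

public class ConverterSketch {
    // Decide how to convert instead of blindly casting to the settable type.
    static String chooseConverter(MapObjectInspector output) {
        if (output instanceof SettableMapObjectInspector) {
            return "in-place";          // safe: write fields directly into the output inspector
        }
        return "copy-to-standard";      // fall back: build a standard settable inspector and copy
    }

    public static void main(String[] args) {
        System.out.println(chooseConverter(new CustomProtoMapInspector())); // copy-to-standard
        System.out.println(chooseConverter(new StandardMapInspector()));    // in-place
    }
}
```

The instanceof check is the whole point: the ClassCastException occurs precisely because the nested converter assumes every inspector it meets is settable.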
[jira] [Commented] (HIVE-4568) Beeline needs to support resolving variables
[ https://issues.apache.org/jira/browse/HIVE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769727#comment-13769727 ] Edward Capriolo commented on HIVE-4568: --- Sorry I have not had time to review this. I am not a good person to do this ATM because I am slightly clueless as to how beeline works. The code looks clean, but I would need to understand a bit more before I give it a +1. Beeline needs to support resolving variables Key: HIVE-4568 URL: https://issues.apache.org/jira/browse/HIVE-4568 Project: Hive Issue Type: Improvement Affects Versions: 0.10.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.12.0 Attachments: HIVE-4568-1.patch, HIVE-4568-2.patch, HIVE-4568.3.patch, HIVE-4568.4.patch, HIVE-4568.5.patch, HIVE-4568.6.patch, HIVE-4568.7.patch, HIVE-4568.patch Previous Hive CLI allows user to specify hive variables at the command line using option --hivevar. In user's script, reference to a hive variable will be substituted with the value of the variable. In such way, user can parameterize his/her script and invoke the script with different hive variable values. The following script is one usage: {code} hive --hivevar INPUT=/user/jenkins/oozie.1371538916178/examples/input-data/table --hivevar OUTPUT=/user/jenkins/oozie.1371538916178/examples/output-data/hive -f script.q {code} script.q makes use of hive variables: {code} CREATE EXTERNAL TABLE test (a INT) STORED AS TEXTFILE LOCATION '${INPUT}'; INSERT OVERWRITE DIRECTORY '${OUTPUT}' SELECT * FROM test; {code} However, after upgrade to hiveserver2 and beeline, this functionality is missing. Beeline doesn't take --hivevar option, and any hive variable isn't passed to server so it cannot be used for substitution. This JIRA is to address this issue, providing a backward compatible behavior at Beeline. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
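The substitution being requested can be sketched client-side (a hypothetical helper, not the actual HIVE-4568 patch): expand ${NAME} references in the script text from the --hivevar map before statements are sent to the server.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of --hivevar substitution for Beeline:
// replaces ${NAME} with the value supplied on the command line,
// leaving undefined variables untouched.
public class HiveVarSubst {
    private static final Pattern VAR = Pattern.compile("\\$\\{(\\w+)\\}");

    public static String substitute(String script, Map<String, String> hivevars) {
        Matcher m = VAR.matcher(script);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String val = hivevars.get(m.group(1));
            // unknown variables are left as-is rather than replaced with null
            m.appendReplacement(out, Matcher.quoteReplacement(val != null ? val : m.group(0)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String q = "CREATE EXTERNAL TABLE test (a INT) STORED AS TEXTFILE LOCATION '${INPUT}';";
        System.out.println(substitute(q, Map.of("INPUT", "/user/jenkins/input")));
    }
}
```

Doing the expansion in the client keeps the behavior backward compatible with the old Hive CLI, which is what the JIRA asks for.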
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-9.patch bq. o Should this include e2e tests in addition (or instead of unit tests)? If/when Hadoop changes the log file format this will break, but unit tests won't catch this since the data that the tests parse is static. There are e2e test cases in a separate ticket: HIVE-5078 bq. Here is a bunch of little things/nits: bq. o Server.java has "if (enablelog == true && !TempletonUtils.isset(statusdir)) throw new BadParam("enablelog is only applicable when statusdir is set");" in 4 different places. Can this be a method? done bq. o What is the purpose of Server#misc()? Should not be there; removed bq. o TempletonControllerJob: import org.apache.hive.hcatalog.templeton.Main; - unused import done bq. oo Line 173 - indentation is off? done bq. oo Line 295 - writer.close() - This writer is connected to System.err. What are the implications of closing this? What if something tries to write to it later? No one after this point is writing to writer. We opened writer, so we need to close it in our code. bq. o TempletonUtils has unused imports - checkstyle needs to be run on the whole patch. done bq. o TestJobIDParser mixes JUnit3 and JUnit4. It should either not extend TestCase (I vote for this) or not use @Test annotations. done bq. o Can JobIDParser (and all subclasses) be made package scoped since they are not used outside the templeton package? Similarly, can methods be made as private as possible? done bq. o JobIDParser#parseJobID() has "fname" param which is not used. What is the intent? Should it be used in the openStatusFile() call? If not, better to remove it. We shall use it in openStatusFile(). Fixed. bq. o JobIDParser#openStatusFile() creates a Reader. Where/when is it being closed? It should be closed in parseJobID. Fixed. bq. o Could the 2 member variables in JobIDParser be made private (even final)?
I can make them protected, but since they will be used in subclasses, I cannot make them private/final. bq. o Why is TestJobIDParser using findJobID() directly? Could it not use parseJobID()? Because parseJobID is hardcoded to the standard output file for that parser, which is stderr in the current directory. In the test, I want to override it to test the input file in the test directory. bq. o Can JobIDParser have 1 line of class-level javadoc about the purpose of this class? done [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog, WebHCat Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch, HIVE-4531-5.patch, HIVE-4531-6.patch, HIVE-4531-7.patch, HIVE-4531-8.patch, HIVE-4531-9.patch, samplestatusdirwithlist.tar.gz It would be nice if we collected task logs after the job finishes. This is similar to what Amazon EMR does. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request 11334: HIVE-4568 Beeline needs to support resolving variables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11334/#review26181 --- beeline/src/test/org/apache/hive/beeline/src/test/TestBeeLineWithArgs.java https://reviews.apache.org/r/11334/#comment51141 This approach of setting the arguments is going to be hard to read and maintain. Can you do something like this? - replace the use of private final String[] args with a function List<String> getBaseArgs(String jdbcUrl); then add (-f, scriptFileName) to the list it returns? Similarly add params to the list in testBeelineCommandLineHiveVariable? Everything else looks good. - Thejas Nair On Sept. 10, 2013, 9:45 p.m., Xuefu Zhang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11334/ --- (Updated Sept. 10, 2013, 9:45 p.m.) Review request for hive and Ashutosh Chauhan. Bugs: HIVE-4568 https://issues.apache.org/jira/browse/HIVE-4568 Repository: hive-git Description --- 1. Added command variable substitution 2. Added test case Diffs - beeline/src/java/org/apache/hive/beeline/BeeLine.java 4c6eb9b beeline/src/java/org/apache/hive/beeline/BeeLine.properties b6650cf beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java 61bdeee beeline/src/java/org/apache/hive/beeline/DatabaseConnection.java c70003d beeline/src/test/org/apache/hive/beeline/src/test/TestBeeLineWithArgs.java 4280449 Diff: https://reviews.apache.org/r/11334/diff/ Testing --- Thanks, Xuefu Zhang
[jira] [Commented] (HIVE-4568) Beeline needs to support resolving variables
[ https://issues.apache.org/jira/browse/HIVE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769798#comment-13769798 ] Thejas M Nair commented on HIVE-4568: - Xuefu, I have added a comment on review board. Beeline needs to support resolving variables Key: HIVE-4568 URL: https://issues.apache.org/jira/browse/HIVE-4568 Project: Hive Issue Type: Improvement Affects Versions: 0.10.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.12.0 Attachments: HIVE-4568-1.patch, HIVE-4568-2.patch, HIVE-4568.3.patch, HIVE-4568.4.patch, HIVE-4568.5.patch, HIVE-4568.6.patch, HIVE-4568.7.patch, HIVE-4568.patch Previous Hive CLI allows user to specify hive variables at the command line using option --hivevar. In user's script, reference to a hive variable will be substituted with the value of the variable. In such way, user can parameterize his/her script and invoke the script with different hive variable values. The following script is one usage: {code} hive --hivevar INPUT=/user/jenkins/oozie.1371538916178/examples/input-data/table --hivevar OUTPUT=/user/jenkins/oozie.1371538916178/examples/output-data/hive -f script.q {code} script.q makes use of hive variables: {code} CREATE EXTERNAL TABLE test (a INT) STORED AS TEXTFILE LOCATION '${INPUT}'; INSERT OVERWRITE DIRECTORY '${OUTPUT}' SELECT * FROM test; {code} However, after upgrade to hiveserver2 and beeline, this functionality is missing. Beeline doesn't take --hivevar option, and any hive variable isn't passed to server so it cannot be used for substitution. This JIRA is to address this issue, providing a backward compatible behavior at Beeline. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5070) Need to implement listLocatedStatus() in ProxyFileSystem
[ https://issues.apache.org/jira/browse/HIVE-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shanyu zhao updated HIVE-5070: -- Fix Version/s: (was: 0.11.1) 0.13.0 Affects Version/s: (was: 0.11.0) 0.12.0 Status: Patch Available (was: Open) Need to implement listLocatedStatus() in ProxyFileSystem Key: HIVE-5070 URL: https://issues.apache.org/jira/browse/HIVE-5070 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.12.0 Reporter: shanyu zhao Fix For: 0.13.0 Attachments: HIVE-5070.patch.txt, HIVE-5070-v2.patch MAPREDUCE-1981 introduced a new API for FileSystem - listLocatedStatus. It is used in Hadoop's FileInputFormat.getSplits(). Hive's ProxyFileSystem class needs to implement this API in order to make Hive unit test work. Otherwise, you'll see these exceptions when running TestCliDriver test case, e.g. results of running allcolref_in_udf.q: [junit] Running org.apache.hadoop.hive.cli.TestCliDriver [junit] Begin query: allcolref_in_udf.q [junit] java.lang.IllegalArgumentException: Wrong FS: pfile:/GitHub/Monarch/project/hive-monarch/build/ql/test/data/warehouse/src, expected: file:/// [junit] at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:642) [junit] at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:69) [junit] at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:375) [junit] at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1482) [junit] at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1522) [junit] at org.apache.hadoop.fs.FileSystem$4.init(FileSystem.java:1798) [junit] at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1797) [junit] at org.apache.hadoop.fs.ChecksumFileSystem.listLocatedStatus(ChecksumFileSystem.java:579) [junit] at org.apache.hadoop.fs.FilterFileSystem.listLocatedStatus(FilterFileSystem.java:235) [junit] at org.apache.hadoop.fs.FilterFileSystem.listLocatedStatus(FilterFileSystem.java:235) [junit] at 
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:264) [junit] at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:217) [junit] at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:69) [junit] at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:385) [junit] at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:351) [junit] at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:389) [junit] at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:503) [junit] at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:495) [junit] at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:390) [junit] at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268) [junit] at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265) [junit] at java.security.AccessController.doPrivileged(Native Method) [junit] at javax.security.auth.Subject.doAs(Subject.java:396) [junit] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1481) [junit] at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265) [junit] at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557) [junit] at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:552) [junit] at java.security.AccessController.doPrivileged(Native Method) [junit] at javax.security.auth.Subject.doAs(Subject.java:396) [junit] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1481) [junit] at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:552) [junit] at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:543) [junit] at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:448) [junit] at 
org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:688) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [junit] at java.lang.reflect.Method.invoke(Method.java:597)
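The "Wrong FS" failure above happens because the proxied pfile: scheme reaches the raw local file system unchanged. The shape of the fix can be sketched in isolation (assumed and simplified; not the actual ProxyFileSystem code): the proxy must override listLocatedStatus() and, as it already does for the other list methods, swap its own scheme back to the underlying one before delegating.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Toy model of the scheme-swapping a proxy file system has to do before
// delegating listLocatedStatus() to the wrapped file system. The "pfile"
// scheme here mirrors Hive's test proxy; the helper itself is illustrative.
public class ProxySketch {
    static URI swapScheme(URI path, String from, String to) throws URISyntaxException {
        if (!from.equals(path.getScheme())) {
            return path; // not ours: pass through untouched
        }
        // rebuild the URI with the underlying scheme so the real FS's checkPath() accepts it
        return new URI(to, path.getAuthority(), path.getPath(), path.getQuery(), path.getFragment());
    }

    public static void main(String[] args) throws Exception {
        URI proxied = new URI("pfile:/build/ql/test/data/warehouse/src");
        System.out.println(swapScheme(proxied, "pfile", "file"));
    }
}
```

Without the override, the default FileSystem.listLocatedStatus() inherited from the base class sees the pfile: path and fails the checkPath() comparison shown in the stack trace.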
Re: Interesting claims that seem untrue
Whatever you count, you get more of :)

On Tue, Sep 17, 2013 at 1:57 PM, Konstantin Boudnik c...@apache.org wrote: Carter, what you are doing essentially contradicts the ASF policy of community over code. Perhaps your intentions are good. However, LOC calculations and other silly contests essentially drive a wedge between developers who happen to draw their paychecks from different commercial entities. The Hadoop community has passed through this already, and it caused nothing but despair and bitterness between people. Unlike some other popular contests, the number of lines contributed doesn't matter to most. Seriously. Regards, Cos

On Mon, Sep 16, 2013 at 01:58 PM, Carter Shanklin wrote: Ed, if nothing else I'm glad it was interesting enough to generate some discussion. These sorts of stats are always the subject of a lot of controversy. I have seen a lot of these sorts of charts float around in confidential slide decks, and I think it's good to have them out in the open where anyone can critique and correct them. In this case, Ed, you've pointed out a legitimate flaw in my analysis. Doing the analysis again, I found that previously, due to a bug in my scripts, JIRAs that didn't have Hudson comments in them were not counted (this was one way it was identifying SVN commit IDs, which I have since removed due to flakiness). Brock's patch was the single largest victim of this bug but not the only one; there were some from Cloudera, NexR, Hortonworks, Facebook, even 2 from you, Ed. The interested can see a full list of exclusions here: https://docs.google.com/spreadsheet/ccc?key=0ArmXd5zzNQm5dDJTMkFtaUk2d0dyU3hnWGJCcUczbXc#gid=0. I apologize to those under-represented; there wasn't any intent on my part to minimize anyone's work. The impact on the final totals is Cloudera +5.4%, NexR +0.8%, Facebook -2.7%, Hortonworks -3.3%. I will be updating the blog later today with the relevant corrections.
There is going to be continued interest in seeing charts like these, for example when Hive 12 is officially done. Sanjay suggested that LoC counts may not be the best way to represent true contribution. I agree that not all lines of code are created equal; for example, a few monster patches recently went in re-arranging HCatalog namespaces and, I think, also indentation style. This (hopefully) mechanical work is not on the same footing as adding new query language features. Still, it is work, and it wouldn't be fair to pretend it didn't happen. If anyone has ideas on better ways to fairly capture contribution, I'm open to suggestions.

On Thu, Sep 12, 2013 at 7:19 AM, Edward Capriolo edlinuxg...@gmail.com wrote: I was reading the Hortonworks blog and found an interesting article. http://hortonworks.com/blog/stinger-phase-2-the-journey-to-100x-faster-hive/#comment-160753 There is a very interesting graphic which attempts to demonstrate lines of code in the 12 release. http://hortonworks.com/wp-content/uploads/2013/09/hive4.png Although I do not know how they are calculated, they are probably counting code generated by test output, but besides that they are wrong. One claim is that Cloudera contributed 4,244 lines of code. So to debunk that claim: in https://issues.apache.org/jira/browse/HIVE-4675 Brock Noland from Cloudera created the ptest2 testing framework. He did all the work for ptest2 in Hive 12, and it is clearly more than 4,244. This consists of 84 Java files:

[edward@desksandra ptest2]$ find . -name '*.java' | wc -l
84

and by itself is 8001 lines of code:

[edward@desksandra ptest2]$ find . -name '*.java' | xargs cat | wc -l
8001
[edward@desksandra hive-trunk]$ wc -l HIVE-4675.patch
7902 HIVE-4675.patch

This is not the only feature from Cloudera in Hive 12. There is also a section of the article that talks of a ROAD MAP for Hive features. I did not know we (Hive) had a road map.
I have advocated switching to feature-based releases and having a road map before, but it was suggested that might limit people from itch-scratching. -- Carter Shanklin Director, Product Management Hortonworks (M): +1.650.644.8795 (T): @cshanklin http://twitter.com/cshanklin -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
[jira] [Commented] (HIVE-4113) Optimize select count(1) with RCFile and Orc
[ https://issues.apache.org/jira/browse/HIVE-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769712#comment-13769712 ] Prasanth J commented on HIVE-4113: -- HIVE-4340 will expose ORC stats through reader interfaces which can be used for optimizing count(*). Optimize select count(1) with RCFile and Orc Key: HIVE-4113 URL: https://issues.apache.org/jira/browse/HIVE-4113 Project: Hive Issue Type: Bug Components: File Formats Reporter: Gopal V Assignee: Brock Noland Fix For: 0.12.0 Attachments: HIVE-4113-0.patch, HIVE-4113.patch, HIVE-4113.patch select count(1) loads up every column every row when used with RCFile. select count(1) from store_sales_10_rc gives {code} Job 0: Map: 5 Reduce: 1 Cumulative CPU: 31.73 sec HDFS Read: 234914410 HDFS Write: 8 SUCCESS {code} Where as, select count(ss_sold_date_sk) from store_sales_10_rc; reads far less {code} Job 0: Map: 5 Reduce: 1 Cumulative CPU: 29.75 sec HDFS Read: 28145994 HDFS Write: 8 SUCCESS {code} Which is 11% of the data size read by the COUNT(1). This was tracked down to the following code in RCFile.java {code} } else { // TODO: if no column name is specified e.g, in select count(1) from tt; // skip all columns, this should be distinguished from the case: // select * from tt; for (int i = 0; i skippedColIDs.length; i++) { skippedColIDs[i] = false; } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
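The TODO in the quoted RCFile code can be resolved by distinguishing an absent column list ("select *") from an explicitly empty one (count(1)). A toy sketch of that distinction (assumed behavior, not the committed patch; the helper name is illustrative):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy sketch: compute which columns a columnar reader can skip.
//   readColumnIds == null  -> no projection given ("select *"): read everything
//   readColumnIds empty    -> query references no columns (count(1)): skip everything
public class ColumnSkipSketch {
    static boolean[] skippedColIDs(int totalColumns, List<Integer> readColumnIds) {
        boolean[] skipped = new boolean[totalColumns];
        if (readColumnIds == null) {
            return skipped;             // all false: read every column
        }
        Arrays.fill(skipped, true);     // default to skipping, then re-enable referenced columns
        for (int id : readColumnIds) {
            skipped[id] = false;
        }
        return skipped;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(skippedColIDs(3, null)));                    // select *
        System.out.println(Arrays.toString(skippedColIDs(3, Collections.emptyList()))); // count(1)
        System.out.println(Arrays.toString(skippedColIDs(3, List.of(1))));              // one column
    }
}
```

Treating the empty list as "skip all" is exactly what would shrink the 234 MB HDFS read above toward the 28 MB single-column case.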
[jira] [Commented] (HIVE-4568) Beeline needs to support resolving variables
[ https://issues.apache.org/jira/browse/HIVE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769718#comment-13769718 ] Xuefu Zhang commented on HIVE-4568: --- [~thejas] [~appodictic] [~ashutoshc] I'm wondering if any of you have cycle to review the patch. It has been pending for quite some time. Let me know if you have any questions. Thanks. Beeline needs to support resolving variables Key: HIVE-4568 URL: https://issues.apache.org/jira/browse/HIVE-4568 Project: Hive Issue Type: Improvement Affects Versions: 0.10.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.12.0 Attachments: HIVE-4568-1.patch, HIVE-4568-2.patch, HIVE-4568.3.patch, HIVE-4568.4.patch, HIVE-4568.5.patch, HIVE-4568.6.patch, HIVE-4568.7.patch, HIVE-4568.patch Previous Hive CLI allows user to specify hive variables at the command line using option --hivevar. In user's script, reference to a hive variable will be substituted with the value of the variable. In such way, user can parameterize his/her script and invoke the script with different hive variable values. The following script is one usage: {code} hive --hivevar INPUT=/user/jenkins/oozie.1371538916178/examples/input-data/table --hivevar OUTPUT=/user/jenkins/oozie.1371538916178/examples/output-data/hive -f script.q {code} script.q makes use of hive variables: {code} CREATE EXTERNAL TABLE test (a INT) STORED AS TEXTFILE LOCATION '${INPUT}'; INSERT OVERWRITE DIRECTORY '${OUTPUT}' SELECT * FROM test; {code} However, after upgrade to hiveserver2 and beeline, this functionality is missing. Beeline doesn't take --hivevar option, and any hive variable isn't passed to server so it cannot be used for substitution. This JIRA is to address this issue, providing a backward compatible behavior at Beeline. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-4531: - Component/s: WebHCat [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog, WebHCat Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch, HIVE-4531-5.patch, HIVE-4531-6.patch, HIVE-4531-7.patch, HIVE-4531-8.patch, samplestatusdirwithlist.tar.gz It would be nice we collect task logs after job finish. This is similar to what Amazon EMR does. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4113) Optimize select count(1) with RCFile and Orc
[ https://issues.apache.org/jira/browse/HIVE-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769716#comment-13769716 ] Prasanth J commented on HIVE-4113: -- Sorry. Please ignore that comment. Row count interface already exists in ORC reader. HIVE-4340 is not relevant for this JIRA. Optimize select count(1) with RCFile and Orc Key: HIVE-4113 URL: https://issues.apache.org/jira/browse/HIVE-4113 Project: Hive Issue Type: Bug Components: File Formats Reporter: Gopal V Assignee: Brock Noland Fix For: 0.12.0 Attachments: HIVE-4113-0.patch, HIVE-4113.patch, HIVE-4113.patch select count(1) loads up every column every row when used with RCFile. select count(1) from store_sales_10_rc gives {code} Job 0: Map: 5 Reduce: 1 Cumulative CPU: 31.73 sec HDFS Read: 234914410 HDFS Write: 8 SUCCESS {code} Where as, select count(ss_sold_date_sk) from store_sales_10_rc; reads far less {code} Job 0: Map: 5 Reduce: 1 Cumulative CPU: 29.75 sec HDFS Read: 28145994 HDFS Write: 8 SUCCESS {code} Which is 11% of the data size read by the COUNT(1). This was tracked down to the following code in RCFile.java {code} } else { // TODO: if no column name is specified e.g, in select count(1) from tt; // skip all columns, this should be distinguished from the case: // select * from tt; for (int i = 0; i skippedColIDs.length; i++) { skippedColIDs[i] = false; } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4961) Create bridge for custom UDFs to operate in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4961: -- Attachment: HIVE-4961.4-vectorization.patch Refactor packages per request from Ashutosh. Create bridge for custom UDFs to operate in vectorized mode --- Key: HIVE-4961 URL: https://issues.apache.org/jira/browse/HIVE-4961 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch Attachments: HIVE-4961.1-vectorization.patch, HIVE-4961.2-vectorization.patch, HIVE-4961.3-vectorization.patch, HIVE-4961.4-vectorization.patch, vectorUDF.4.patch, vectorUDF.5.patch, vectorUDF.8.patch, vectorUDF.9.patch Suppose you have a custom UDF myUDF() that you've created to extend hive. The goal of this JIRA is to create a facility where if you run a query that uses myUDF() in an expression, the query will run in vectorized mode. This would be a general-purpose bridge for custom UDFs that users add to Hive. It would work with existing UDFs. I'm considering a separate JIRA for a new kind of custom UDF implementation that is vectorized from the beginning, to optimize performance. That is not covered by this JIRA. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5156) HiveServer2 jdbc ResultSet.close should free up resources on server side
[ https://issues.apache.org/jira/browse/HIVE-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-5156: --- Attachment: HIVE-5156.D12837.3.patch HiveServer2 jdbc ResultSet.close should free up resources on server side Key: HIVE-5156 URL: https://issues.apache.org/jira/browse/HIVE-5156 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Priority: Minor Attachments: HIVE-5156.D12837.3.patch ResultSet.close does not free up any resources (tmp files etc) on hive server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5206) Support parameterized primitive types
[ https://issues.apache.org/jira/browse/HIVE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769889#comment-13769889 ] Thejas M Nair commented on HIVE-5206: - Patch committed to 0.12 branch Support parameterized primitive types - Key: HIVE-5206 URL: https://issues.apache.org/jira/browse/HIVE-5206 Project: Hive Issue Type: Improvement Components: Types Reporter: Jason Dere Assignee: Jason Dere Fix For: 0.12.0 Attachments: HIVE-5206.1.patch, HIVE-5206.2.patch, HIVE-5206.3.patch, HIVE-5206.4.patch, HIVE-5206.D12693.1.patch, HIVE-5206.v12.1.patch Support for parameterized types is needed for char/varchar/decimal support. This adds a type parameters value to the PrimitiveTypeEntry/PrimitiveTypeInfo/PrimitiveObjectInspector objects. NO PRECOMMIT TESTS - dependent on HIVE-5203/HIVE-5204 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5206) Support parameterized primitive types
[ https://issues.apache.org/jira/browse/HIVE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5206: Fix Version/s: (was: 0.13.0) 0.12.0 Support parameterized primitive types - Key: HIVE-5206 URL: https://issues.apache.org/jira/browse/HIVE-5206 Project: Hive Issue Type: Improvement Components: Types Reporter: Jason Dere Assignee: Jason Dere Fix For: 0.12.0 Attachments: HIVE-5206.1.patch, HIVE-5206.2.patch, HIVE-5206.3.patch, HIVE-5206.4.patch, HIVE-5206.D12693.1.patch, HIVE-5206.v12.1.patch Support for parameterized types is needed for char/varchar/decimal support. This adds a type parameters value to the PrimitiveTypeEntry/PrimitiveTypeInfo/PrimitiveObjectInspector objects. NO PRECOMMIT TESTS - dependent on HIVE-5203/HIVE-5204 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5278) Move some string UDFs to GenericUDFs, for better varchar support
[ https://issues.apache.org/jira/browse/HIVE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769892#comment-13769892 ] Thejas M Nair commented on HIVE-5278: - Patch committed to 0.12 branch. Move some string UDFs to GenericUDFs, for better varchar support Key: HIVE-5278 URL: https://issues.apache.org/jira/browse/HIVE-5278 Project: Hive Issue Type: Improvement Components: Types, UDF Reporter: Jason Dere Assignee: Jason Dere Fix For: 0.12.0 Attachments: D12909.1.patch, HIVE-5278.1.patch, HIVE-5278.2.patch, HIVE-5278.v12.1.patch To better support varchar/char types in string UDFs, select UDFs should be converted to GenericUDFs. This allows the UDF to return the resulting char/varchar length in the type metadata. This work is being split off as a separate task from HIVE-4844. The initial UDFs as part of this work are concat/lower/upper. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5278) Move some string UDFs to GenericUDFs, for better varchar support
[ https://issues.apache.org/jira/browse/HIVE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5278: Fix Version/s: (was: 0.13.0) 0.12.0 Move some string UDFs to GenericUDFs, for better varchar support Key: HIVE-5278 URL: https://issues.apache.org/jira/browse/HIVE-5278 Project: Hive Issue Type: Improvement Components: Types, UDF Reporter: Jason Dere Assignee: Jason Dere Fix For: 0.12.0 Attachments: D12909.1.patch, HIVE-5278.1.patch, HIVE-5278.2.patch, HIVE-5278.v12.1.patch To better support varchar/char types in string UDFs, select UDFs should be converted to GenericUDFs. This allows the UDF to return the resulting char/varchar length in the type metadata. This work is being split off as a separate task from HIVE-4844. The initial UDFs as part of this work are concat/lower/upper. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5161) Additional SerDe support for varchar type
[ https://issues.apache.org/jira/browse/HIVE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5161: Fix Version/s: (was: 0.13.0) 0.12.0 Additional SerDe support for varchar type - Key: HIVE-5161 URL: https://issues.apache.org/jira/browse/HIVE-5161 Project: Hive Issue Type: Bug Components: Serializers/Deserializers, Types Reporter: Jason Dere Assignee: Jason Dere Fix For: 0.12.0 Attachments: D12897.1.patch, HIVE-5161.1.patch, HIVE-5161.2.patch, HIVE-5161.3.patch, HIVE-5161.v12.1.patch Breaking out support for varchar for the various SerDes as an additional task. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Interesting claims that seem untrue
Whatever you count, you get more of :) Then let's count lines of documentation! ;) -- Lefty On Tue, Sep 17, 2013 at 12:15 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Whatever you count, you get more of :) On Tue, Sep 17, 2013 at 1:57 PM, Konstantin Boudnik c...@apache.org wrote: Carter, what you are doing essentially contradicts the ASF policy of community over code. Perhaps your intentions are good. However, LOC calculations or other silly contests are essentially driving a wedge between developers who happen to draw their paycheck from different commercial entities. The Hadoop community passed through this already and it caused nothing but despair and bitterness between the people. Unlike some other popular contests, the number of lines contributed doesn't matter for most. Seriously. Regards, Cos On Mon, Sep 16, 2013 at 01:58PM, Carter Shanklin wrote: Ed, If nothing else I'm glad it was interesting enough to generate some discussion. These sorts of stats are always subjects of a lot of controversy. I have seen a lot of these sorts of charts float around in confidential slide decks and I think it's good to have them out in the open where anyone can critique and correct them. In this case Ed, you've pointed out a legitimate flaw in my analysis. Doing the analysis again I found that previously, due to a bug in my scripts, JIRAs that didn't have Hudson comments in them were not counted (this was one way it was identifying SVN commit IDs, which I have since removed due to flakiness). Brock's patch was the single largest victim of this bug but not the only one; there were some from Cloudera, NexR, Hortonworks, Facebook, even 2 from you Ed. The interested can see a full list of exclusions here: https://docs.google.com/spreadsheet/ccc?key=0ArmXd5zzNQm5dDJTMkFtaUk2d0dyU3hnWGJCcUczbXc#gid=0 . I apologize to those under-represented; there wasn't any intent on my part to minimize anyone's work. 
The impact in final totals is Cloudera +5.4%, NexR +0.8%, Facebook -2.7%, Hortonworks -3.3%. I will be updating the blog later today with relevant corrections. There is going to be continued interest in seeing charts like these, for example when Hive 12 is officially done. Sanjay suggested that LoC counts may not be the best way to represent true contribution. I agree that not all lines of code are created equal; for example, a few monster patches recently went in re-arranging HCatalog namespaces and I think also indentation style. This (hopefully) mechanical work is not on the same footing as adding new query language features. Still it is work and it wouldn't be fair to pretend it didn't happen. If anyone has ideas on better ways to fairly capture contribution I'm open to suggestions. On Thu, Sep 12, 2013 at 7:19 AM, Edward Capriolo edlinuxg...@gmail.com wrote: I was reading the Hortonworks blog and found an interesting article. http://hortonworks.com/blog/stinger-phase-2-the-journey-to-100x-faster-hive/#comment-160753 There is a very interesting graphic which attempts to demonstrate lines of code in the 12 release. http://hortonworks.com/wp-content/uploads/2013/09/hive4.png Although I do not know how they are calculated, they are probably counting code generated by tests output, but besides that they are wrong. One claim is that Cloudera contributed 4,244 lines of code. So to debunk that claim: In https://issues.apache.org/jira/browse/HIVE-4675 Brock Noland from Cloudera created the ptest2 testing framework. He did all the work for ptest2 in Hive 12, and it is clearly more than 4,244. This consists of 84 java files [edward@desksandra ptest2]$ find . -name *.java | wc -l 84 and by itself is 8001 lines of code. [edward@desksandra ptest2]$ find . -name *.java | xargs cat | wc -l 8001 [edward@desksandra hive-trunk]$ wc -l HIVE-4675.patch 7902 HIVE-4675.patch This is not the only feature from Cloudera in Hive 12. 
There is also a section of the article that talks of a ROAD MAP for hive features. I did not know we (hive) had a road map. I have advocated switching to feature based release and having a road map before, but it was suggested that might limit people from itch-scratching. -- Carter Shanklin Director, Product Management Hortonworks (M): +1.650.644.8795 (T): @cshanklin http://twitter.com/cshanklin -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have
[jira] [Updated] (HIVE-5086) Fix scriptfile1.q on Windows
[ https://issues.apache.org/jira/browse/HIVE-5086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5086: - Attachment: HIVE-5086-2.patch Fixed unit test failure. Fix scriptfile1.q on Windows Key: HIVE-5086 URL: https://issues.apache.org/jira/browse/HIVE-5086 Project: Hive Issue Type: Bug Components: Tests, Windows Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-5086-1.patch, HIVE-5086-2.patch Test failed with error message: [junit] Task with the most failures(4): [junit] - [junit] Task ID: [junit] task_20130814023904691_0001_m_00 [junit] [junit] URL: [junit] http://localhost:50030/taskdetails.jsp?jobid=job_20130814023904691_0001tipid=task_20130814023904691_0001_m_00 [junit] - [junit] Diagnostic Messages for this Task: [junit] java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {key:238,value:val_238} [junit] at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:175) [junit] at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) [junit] at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429) [junit] at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365) [junit] at org.apache.hadoop.mapred.Child$4.run(Child.java:271) [junit] at java.security.AccessController.doPrivileged(Native Method) [junit] at javax.security.auth.Subject.doAs(Subject.java:396) [junit] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232) [junit] at org.apache.hadoop.mapred.Child.main(Child.java:265) [junit] Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {key:238,value:val_238} [junit] at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:538) [junit] at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:157) [junit] ... 
8 more [junit] Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 2]: Unable to initialize custom script. [junit] at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:357) [junit] at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504) [junit] at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:848) [junit] at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:88) [junit] at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504) [junit] at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:848) [junit] at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90) [junit] at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504) [junit] at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:848) [junit] at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:528) [junit] ... 9 more [junit] Caused by: java.io.IOException: Cannot run program D:\tmp\hadoop-Administrator\mapred\local\3_0\taskTracker\Administrator\jobcache\job_20130814023904691_0001\attempt_20130814023904691_0001_m_00_3\work\.\testgrep: CreateProcess error=193, %1 is not a valid Win32 application [junit] at java.lang.ProcessBuilder.start(ProcessBuilder.java:460) [junit] at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:316) [junit] ... 18 more [junit] Caused by: java.io.IOException: CreateProcess error=193, %1 is not a valid Win32 application [junit] at java.lang.ProcessImpl.create(Native Method) [junit] at java.lang.ProcessImpl.init(ProcessImpl.java:81) [junit] at java.lang.ProcessImpl.start(ProcessImpl.java:30) [junit] at java.lang.ProcessBuilder.start(ProcessBuilder.java:453) [junit] ... 19 more [junit] [junit] [junit] Exception: Client Execution failed with error code = 2 [junit] See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs. 
[junit] junit.framework.AssertionFailedError: Client Execution failed with error code = 2 [junit] See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs. [junit] at junit.framework.Assert.fail(Assert.java:47) [junit] at org.apache.hadoop.hive.cli.TestMinimrCliDriver.runTest(TestMinimrCliDriver.java:122) [junit] at org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_scriptfile1(TestMinimrCliDriver.java:104) [junit] at
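The root cause in the trace above is that Windows CreateProcess only launches native executables, so handing it a plain script file fails with error=193 ("%1 is not a valid Win32 application"). One common workaround, sketched below with invented helper names (this is not the actual HIVE-5086 patch), is to prefix non-executable scripts with an explicit interpreter on Windows:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the interpreter-prefix workaround for
// CreateProcess error=193; names are invented, not the HIVE-5086 fix.
public class ScriptCommandBuilder {
    // On Windows, CreateProcess can only start real executables, so a
    // plain script must be routed through an interpreter (cmd /c here).
    // On Unix the script is executable directly via its shebang line.
    static List<String> buildCommand(String script, boolean isWindows) {
        List<String> cmd = new ArrayList<>();
        if (isWindows && !script.toLowerCase().endsWith(".exe")) {
            cmd.add("cmd");
            cmd.add("/c");
        }
        cmd.add(script);
        return cmd;
    }
}
```

The resulting list would then be handed to ProcessBuilder instead of the bare script path that failed in the trace.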
[jira] [Updated] (HIVE-4844) Add varchar data type
[ https://issues.apache.org/jira/browse/HIVE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-4844: Fix Version/s: (was: 0.13.0) 0.12.0 Add varchar data type - Key: HIVE-4844 URL: https://issues.apache.org/jira/browse/HIVE-4844 Project: Hive Issue Type: New Feature Components: Types Reporter: Jason Dere Assignee: Jason Dere Fix For: 0.12.0 Attachments: HIVE-4844.10.patch, HIVE-4844.11.patch, HIVE-4844.12.patch, HIVE-4844.13.patch, HIVE-4844.14.patch, HIVE-4844.15.patch, HIVE-4844.16.patch, HIVE-4844.17.patch, HIVE-4844.18.patch, HIVE-4844.19.patch, HIVE-4844.1.patch.hack, HIVE-4844.2.patch, HIVE-4844.3.patch, HIVE-4844.4.patch, HIVE-4844.5.patch, HIVE-4844.6.patch, HIVE-4844.7.patch, HIVE-4844.8.patch, HIVE-4844.9.patch, HIVE-4844.D12699.1.patch, HIVE-4844.D12891.1.patch, HIVE-4844.v12.1.patch, screenshot.png Add new varchar data types which have support for more SQL-compliant behavior, such as SQL string comparison semantics, max length, etc. Char type will be added as another task. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5084) Fix newline.q on Windows
[ https://issues.apache.org/jira/browse/HIVE-5084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769914#comment-13769914 ] Thejas M Nair commented on HIVE-5084: - Patch committed to 0.12 branch. Fix newline.q on Windows Key: HIVE-5084 URL: https://issues.apache.org/jira/browse/HIVE-5084 Project: Hive Issue Type: Bug Components: Tests, Windows Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-5084-1.patch Test failed with vague error message: [junit] Error during job, obtaining debugging information... [junit] junit.framework.AssertionFailedError: Client Execution failed with error code = 2 hive.log doesn't show anything interesting either: 2013-08-14 00:47:29,411 DEBUG zookeeper.ClientCnxn (ClientCnxn.java:readResponse(723)) - Got ping response for sessionid: 0x1407a49fc1e0003 after 1ms 2013-08-14 00:47:31,391 ERROR exec.Task (SessionState.java:printError(416)) - Execution failed with exit status: 2 2013-08-14 00:47:31,391 ERROR exec.Task (SessionState.java:printError(416)) - Obtaining error information 2013-08-14 00:47:31,392 ERROR exec.Task (SessionState.java:printError(416)) - Task failed! Task ID: Stage-1 Logs: -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5301) Add a schema tool for offline metastore schema upgrade
[ https://issues.apache.org/jira/browse/HIVE-5301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769919#comment-13769919 ] Ashutosh Chauhan commented on HIVE-5301: [~prasadm] Can you create RB or phabricator link for this? Add a schema tool for offline metastore schema upgrade -- Key: HIVE-5301 URL: https://issues.apache.org/jira/browse/HIVE-5301 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.11.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.12.0 Attachments: HIVE-5301.1.patch, HIVE-5301-with-HIVE-3764.0.patch HIVE-3764 is addressing metastore version consistency. Besides, it would be helpful to add a tool that can leverage this version information to figure out the required set of upgrade scripts, and execute those against the configured metastore. Now that Hive includes the Beeline client, it can be used to execute the scripts. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request 14169: HIVE-3764: Support metastore version consistency check
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/14169/#review26182 --- Mostly looks good. Some comments. metastore/scripts/upgrade/derby/hive-schema-0.12.0.derby.sql https://reviews.apache.org/r/14169/#comment51142 Name 'comment' has caused problems previously. I will suggest to name it VERSION_COMMENT, VCOMMENT or any other variation of it. metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreSchemaInfo.java https://reviews.apache.org/r/14169/#comment51143 Looks like this line can be removed. metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreSchemaInfo.java https://reviews.apache.org/r/14169/#comment51144 typo metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreSchemaInfo.java https://reviews.apache.org/r/14169/#comment51145 Can you name this variable version. I got confused thinking curVersion implies current version of jars (which was incorrect) metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreSchemaInfo.java https://reviews.apache.org/r/14169/#comment51146 Will be good to do currVersion.trim() here. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java https://reviews.apache.org/r/14169/#comment51147 Can you add a comment why we need to do a recheck? Seems like its not necessary. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java https://reviews.apache.org/r/14169/#comment51148 Should this be if(strictValidation ... ) - Ashutosh Chauhan On Sept. 17, 2013, 6:13 a.m., Prasad Mujumdar wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/14169/ --- (Updated Sept. 17, 2013, 6:13 a.m.) Review request for hive, Ashutosh Chauhan and Brock Noland. Bugs: HIVE-3764 https://issues.apache.org/jira/browse/HIVE-3764 Repository: hive-git Description --- This is a 0.12 specific patch. The trunk patch will include additional metastore scripts which I will attach separately to the ticket. 
- Added a new table in the metastore schema to store the Hive version in the metastore. - The metastore handler compares the version stored in the schema with its own version. If there's a mismatch, then it can either record the correct version or raise an error. The behavior is configurable via a new Hive config. This config, when set, also restricts DataNucleus from auto-upgrading the schema. - The new schema creation and upgrade scripts record the new version in the metastore version table. - Added 0.12 upgrade scripts for all supported DBs to create the new version tables in the 0.12 metastore schema. The current patch has the verification turned off by default. I would prefer to keep it enabled, though that requires any ad-hoc setup to explicitly disable it (or create the metastore schema by running scripts). The default can be changed or left as is per the consensus. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 22149e4 conf/hive-default.xml.template 9a3fc1d metastore/scripts/upgrade/derby/014-HIVE-3764.derby.sql PRE-CREATION metastore/scripts/upgrade/derby/hive-schema-0.12.0.derby.sql cce544f metastore/scripts/upgrade/derby/upgrade-0.10.0-to-0.11.0.derby.sql cae7936 metastore/scripts/upgrade/derby/upgrade-0.11.0-to-0.12.0.derby.sql 492cc93 metastore/scripts/upgrade/derby/upgrade.order.derby PRE-CREATION metastore/scripts/upgrade/mysql/014-HIVE-3764.mysql.sql PRE-CREATION metastore/scripts/upgrade/mysql/hive-schema-0.12.0.mysql.sql 22a77fe metastore/scripts/upgrade/mysql/upgrade-0.11.0-to-0.12.0.mysql.sql 375a05f metastore/scripts/upgrade/mysql/upgrade.order.mysql PRE-CREATION metastore/scripts/upgrade/oracle/014-HIVE-3764.oracle.sql PRE-CREATION metastore/scripts/upgrade/oracle/hive-schema-0.12.0.oracle.sql 85a0178 metastore/scripts/upgrade/oracle/upgrade-0.10.0-to-0.11.0.mysql.sql PRE-CREATION metastore/scripts/upgrade/oracle/upgrade-0.11.0-to-0.12.0.oracle.sql a2d0901 metastore/scripts/upgrade/oracle/upgrade.order.oracle PRE-CREATION 
metastore/scripts/upgrade/postgres/014-HIVE-3764.postgres.sql PRE-CREATION metastore/scripts/upgrade/postgres/hive-schema-0.12.0.postgres.sql 7b319ba metastore/scripts/upgrade/postgres/upgrade-0.11.0-to-0.12.0.postgres.sql 9da0a1b metastore/scripts/upgrade/postgres/upgrade.order.postgres PRE-CREATION metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 39dda92 metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreSchemaInfo.java PRE-CREATION metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java a27243d
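The review comments above (trim the recorded version before comparing, gate the failure on strict validation) suggest verification logic along these lines. This is an illustrative sketch with invented class, method, and exception choices, not the actual HIVE-3764 code:

```java
// Illustrative sketch of metastore schema-version verification reflecting
// the review feedback above; names are invented, not the HIVE-3764 patch.
public class SchemaVersionCheck {
    // Compare the version recorded in the metastore's version table with
    // the version the running jars expect; whitespace in the stored value
    // is tolerated via trim(), per the review comment.
    static boolean versionMatches(String recorded, String expected) {
        return recorded != null && expected.equalsIgnoreCase(recorded.trim());
    }

    // In strict mode a mismatch is fatal; otherwise the caller may record
    // the correct version and continue.
    static void verify(String recorded, String expected, boolean strictValidation) {
        if (!versionMatches(recorded, expected) && strictValidation) {
            throw new IllegalStateException(
                "Metastore schema version " + recorded
                + " does not match expected version " + expected);
        }
    }
}
```

Splitting the comparison from the policy decision keeps the strict/lenient behavior (the new Hive config described in the patch summary) in one place.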
[jira] [Updated] (HIVE-5084) Fix newline.q on Windows
[ https://issues.apache.org/jira/browse/HIVE-5084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5084: Fix Version/s: (was: 0.13.0) 0.12.0 Fix newline.q on Windows Key: HIVE-5084 URL: https://issues.apache.org/jira/browse/HIVE-5084 Project: Hive Issue Type: Bug Components: Tests, Windows Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-5084-1.patch Test failed with vague error message: [junit] Error during job, obtaining debugging information... [junit] junit.framework.AssertionFailedError: Client Execution failed with error code = 2 hive.log doesn't show anything interesting either: 2013-08-14 00:47:29,411 DEBUG zookeeper.ClientCnxn (ClientCnxn.java:readResponse(723)) - Got ping response for sessionid: 0x1407a49fc1e0003 after 1ms 2013-08-14 00:47:31,391 ERROR exec.Task (SessionState.java:printError(416)) - Execution failed with exit status: 2 2013-08-14 00:47:31,391 ERROR exec.Task (SessionState.java:printError(416)) - Obtaining error information 2013-08-14 00:47:31,392 ERROR exec.Task (SessionState.java:printError(416)) - Task failed! Task ID: Stage-1 Logs: -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5304) JDO and SQL filters can both return different results for string compares depending on underlying datastore
[ https://issues.apache.org/jira/browse/HIVE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769931#comment-13769931 ] Sergey Shelukhin commented on HIVE-5304: Actually, all names - I see show indexes order also changes in some queries, etc. JDO and SQL filters can both return different results for string compares depending on underlying datastore --- Key: HIVE-5304 URL: https://issues.apache.org/jira/browse/HIVE-5304 Project: Hive Issue Type: Bug Components: Metastore Reporter: Sergey Shelukhin Hive uses JDOQL filters to optimize partition retrieval; recently direct SQL was added to optimize it further. Both of these methods may end up pushing StringCol op 'SomeString' to the underlying SQL datastore. Many paths also push order by-s, although these are not as problematic. The problem is that different datastores handle string compares differently. While testing on Postgres, I see that this results in different things, from innocent ones like order changes in show partitions, to more serious ones like {code} alter table ptestfilter drop partition (c='US', d='2') {code} in drop_partitions_filter.q - in Derby, with which the .q.out file was generated, it drops c=Uganda/d=2; this also passes on MySQL (I ran tests with autocreated db); on Postgres with a db from the script it doesn't. Looks like we need to enforce collation in partition names and part_key_values-es; both in the create scripts, as well as during autocreate (via package.jdo?) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
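The ordering differences described above are easy to reproduce without any database: a binary comparison (Derby-like collation) sorts by raw code points, while a case-insensitive comparison (the default collation in many MySQL/Postgres setups) folds case first, and the two can disagree on direction for mixed-case partition values. A self-contained illustration:

```java
// Demonstrates why partition ordering can differ across datastore
// collations: binary vs case-insensitive comparison disagree on
// mixed-case values like the partition names in the bug report.
public class CollationDemo {
    public static void main(String[] args) {
        // Binary: 'S' (83) < 'g' (103), so "US" sorts before "Uganda".
        int binary = "US".compareTo("Uganda");
        // Case-folded: 's' (115) > 'g' (103), so "US" sorts after "Uganda".
        int caseInsensitive = "US".compareToIgnoreCase("Uganda");
        System.out.println(binary < 0);          // true
        System.out.println(caseInsensitive > 0); // true
    }
}
```

This is exactly the kind of divergence that makes a .q.out file generated against Derby mismatch on Postgres, and why the report suggests pinning the collation of partition-name columns in the create scripts.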