[jira] [Resolved] (HIVE-3201) Thrift build target should clean generated source directories
[ https://issues.apache.org/jira/browse/HIVE-3201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-3201. -- Resolution: Invalid This ticket was made irrelevant by the switch to Maven. Thrift build target should clean generated source directories - Key: HIVE-3201 URL: https://issues.apache.org/jira/browse/HIVE-3201 Project: Hive Issue Type: Bug Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-2672) CLI fails to start when run on Hadoop 0.23.0
[ https://issues.apache.org/jira/browse/HIVE-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-2672. -- Resolution: Won't Fix No longer relevant. CLI fails to start when run on Hadoop 0.23.0 Key: HIVE-2672 URL: https://issues.apache.org/jira/browse/HIVE-2672 Project: Hive Issue Type: Bug Components: CLI, Shims Reporter: Carl Steinbach Assignee: Carl Steinbach -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-3150) Ivy checkModified setting should be configurable
[ https://issues.apache.org/jira/browse/HIVE-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-3150. -- Resolution: Won't Fix Mavenization made this ticket irrelevant. Ivy checkModified setting should be configurable Key: HIVE-3150 URL: https://issues.apache.org/jira/browse/HIVE-3150 Project: Hive Issue Type: Bug Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-2947) Arc test -- IGNORE
[ https://issues.apache.org/jira/browse/HIVE-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-2947. -- Resolution: Invalid Arc test -- IGNORE -- Key: HIVE-2947 URL: https://issues.apache.org/jira/browse/HIVE-2947 Project: Hive Issue Type: Bug Reporter: Carl Steinbach Assignee: Carl Steinbach -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-2878) TestHBaseMinimrCliDriver hbase_bulk.m fails on 0.23
[ https://issues.apache.org/jira/browse/HIVE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-2878. -- Resolution: Won't Fix TestHBaseMinimrCliDriver hbase_bulk.m fails on 0.23 --- Key: HIVE-2878 URL: https://issues.apache.org/jira/browse/HIVE-2878 Project: Hive Issue Type: Bug Components: HBase Handler, Tests Affects Versions: 0.8.1 Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-2878.br08.1.patch.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-2877) TABLESAMPLE(x PERCENT) tests fail on 0.22/0.23
[ https://issues.apache.org/jira/browse/HIVE-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-2877. -- Resolution: Won't Fix TABLESAMPLE(x PERCENT) tests fail on 0.22/0.23 -- Key: HIVE-2877 URL: https://issues.apache.org/jira/browse/HIVE-2877 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-2697) Ant compile-test target should be triggered from subprojects, not from top-level targets
[ https://issues.apache.org/jira/browse/HIVE-2697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-2697. -- Resolution: Won't Fix Ant compile-test target should be triggered from subprojects, not from top-level targets Key: HIVE-2697 URL: https://issues.apache.org/jira/browse/HIVE-2697 Project: Hive Issue Type: Improvement Components: Build Infrastructure Affects Versions: 0.8.0 Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-2697.1.patch.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-1906) Fix Eclipse classpath and add Eclipse launch configurations for HiveServer and MetaStoreServer
[ https://issues.apache.org/jira/browse/HIVE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-1906. -- Resolution: Won't Fix Fix Eclipse classpath and add Eclipse launch configurations for HiveServer and MetaStoreServer -- Key: HIVE-1906 URL: https://issues.apache.org/jira/browse/HIVE-1906 Project: Hive Issue Type: Task Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1906.1.patch.txt, HIVE-1906.2.patch.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-2437) update project website navigation links
[ https://issues.apache.org/jira/browse/HIVE-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-2437. -- Resolution: Fixed update project website navigation links --- Key: HIVE-2437 URL: https://issues.apache.org/jira/browse/HIVE-2437 Project: Hive Issue Type: Sub-task Reporter: John Sichi http://www.apache.org/foundation/marks/pmcs.html#navigation -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-2896) Ant eclipse target should trigger generation of CliDriver tests
[ https://issues.apache.org/jira/browse/HIVE-2896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-2896. -- Resolution: Won't Fix Ant eclipse target should trigger generation of CliDriver tests --- Key: HIVE-2896 URL: https://issues.apache.org/jira/browse/HIVE-2896 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-998) Cleanup UDF testcases
[ https://issues.apache.org/jira/browse/HIVE-998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-998: Assignee: (was: Carl Steinbach) Cleanup UDF testcases - Key: HIVE-998 URL: https://issues.apache.org/jira/browse/HIVE-998 Project: Hive Issue Type: Test Reporter: Carl Steinbach * For every UDF x there should be a corresponding udf_x.q testcase. * Every udf_x.q file should begin with DESCRIBE FUNCTION X, and DESCRIBE EXTENDED FUNCTION X. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8933) Check release builds for SNAPSHOT dependencies
[ https://issues.apache.org/jira/browse/HIVE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-8933: - Resolution: Fixed Fix Version/s: 0.14.1 0.15.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Alan, thanks for the review. Committed to trunk and branch-0.14. Check release builds for SNAPSHOT dependencies -- Key: HIVE-8933 URL: https://issues.apache.org/jira/browse/HIVE-8933 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.15.0, 0.14.1 Attachments: HIVE-8933.1.patch.txt Hive 0.14.0 was released with SNAPSHOT dependencies. We should use the maven enforcer plugin to prevent this from happening again in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8845) Switch to Tez 0.5.2
[ https://issues.apache.org/jira/browse/HIVE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-8845: - Affects Version/s: 0.14.0 Switch to Tez 0.5.2 --- Key: HIVE-8845 URL: https://issues.apache.org/jira/browse/HIVE-8845 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.14.1 Attachments: HIVE-8845.1.patch Tez 0.5.2 has been released, we should switch our pom to that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8873) Switch to calcite 0.9.2
[ https://issues.apache.org/jira/browse/HIVE-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-8873: - Affects Version/s: 0.14.0 Switch to calcite 0.9.2 --- Key: HIVE-8873 URL: https://issues.apache.org/jira/browse/HIVE-8873 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.14.1 Attachments: HIVE-8873.1.patch Calcite release 0.9.2 is out. We should update pom to use it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8933) Check release builds for SNAPSHOT dependencies
Carl Steinbach created HIVE-8933: Summary: Check release builds for SNAPSHOT dependencies Key: HIVE-8933 URL: https://issues.apache.org/jira/browse/HIVE-8933 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Hive 0.14.0 was released with SNAPSHOT dependencies. We should use the maven enforcer plugin to prevent this from happening again in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
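The ticket proposes the Maven enforcer plugin as the mechanism. The check itself is simple, and the sketch below shows only that check: flag any dependency whose version ends in -SNAPSHOT. The class name, method name, and coordinates are hypothetical and are not taken from the HIVE-8933 patch or from the plugin.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: illustrates the check a release build needs.
// It is not the Maven enforcer plugin and not code from the HIVE-8933 patch.
class SnapshotDependencyCheck {

    // Returns "coordinate:version" for every dependency on a SNAPSHOT version.
    static List<String> findSnapshots(Map<String, String> dependencyVersions) {
        List<String> offenders = new ArrayList<>();
        for (Map.Entry<String, String> e : dependencyVersions.entrySet()) {
            if (e.getValue().endsWith("-SNAPSHOT")) {
                offenders.add(e.getKey() + ":" + e.getValue());
            }
        }
        return offenders;
    }

    public static void main(String[] args) {
        // Illustrative coordinates only.
        Map<String, String> deps = new LinkedHashMap<>();
        deps.put("org.example:lib-a", "0.5.2-SNAPSHOT");
        deps.put("org.example:lib-b", "1.0.0");

        List<String> offenders = findSnapshots(deps);
        if (!offenders.isEmpty()) {
            System.err.println("Release build depends on SNAPSHOT artifacts: " + offenders);
        }
    }
}
```

A release build would fail when the returned list is non-empty, which is the behaviour the ticket asks for.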
[jira] [Updated] (HIVE-8933) Check release builds for SNAPSHOT dependencies
[ https://issues.apache.org/jira/browse/HIVE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-8933: - Attachment: HIVE-8933.1.patch.txt Check release builds for SNAPSHOT dependencies -- Key: HIVE-8933 URL: https://issues.apache.org/jira/browse/HIVE-8933 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-8933.1.patch.txt Hive 0.14.0 was released with SNAPSHOT dependencies. We should use the maven enforcer plugin to prevent this from happening again in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8933) Check release builds for SNAPSHOT dependencies
[ https://issues.apache.org/jira/browse/HIVE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-8933: - Status: Patch Available (was: Open) RB: https://reviews.apache.org/r/28308/ [~alangates], would you mind reviewing this? Thanks! Check release builds for SNAPSHOT dependencies -- Key: HIVE-8933 URL: https://issues.apache.org/jira/browse/HIVE-8933 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-8933.1.patch.txt Hive 0.14.0 was released with SNAPSHOT dependencies. We should use the maven enforcer plugin to prevent this from happening again in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8906) Hive 0.14.0 release depends on Tez and Calcite SNAPSHOT artifacts
Carl Steinbach created HIVE-8906: Summary: Hive 0.14.0 release depends on Tez and Calcite SNAPSHOT artifacts Key: HIVE-8906 URL: https://issues.apache.org/jira/browse/HIVE-8906 Project: Hive Issue Type: Bug Reporter: Carl Steinbach The Hive 0.14.0 release depends on SNAPSHOT versions of tez-0.5.2 and calcite-0.9.2. I believe this violates Apache release policy (can't find the reference, but I seem to remember this being a problem with HCatalog before the merger), and it implies that the folks who tested the release weren't necessarily testing the same thing. It also means that people who try to build Hive using the 0.14.0 src release will encounter errors unless they configure Maven to pull artifacts from the snapshot repository. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-2390) Add UNIONTYPE serialization support to LazyBinarySerDe
[ https://issues.apache.org/jira/browse/HIVE-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-2390: - Summary: Add UNIONTYPE serialization support to LazyBinarySerDe (was: Expand support for union types) Add UNIONTYPE serialization support to LazyBinarySerDe -- Key: HIVE-2390 URL: https://issues.apache.org/jira/browse/HIVE-2390 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Jakob Homan Assignee: Suma Shivaprasad Labels: TODOC14, uniontype Fix For: 0.14.0 Attachments: HIVE-2390.1.patch, HIVE-2390.patch When the union type was introduced, full support for it wasn't provided. For instance, when working with a union that gets passed to LazyBinarySerde: {noformat}Caused by: java.lang.RuntimeException: Unrecognized type: UNION at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:468) at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:230) at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:184) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-2390) Add UNIONTYPE serialization support to LazyBinarySerDe
[ https://issues.apache.org/jira/browse/HIVE-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128124#comment-14128124 ] Carl Steinbach commented on HIVE-2390: -- I updated the description of this ticket to accurately reflect the change that was made in this patch. My impression is that this patch doesn't really change the situation in Hive with respect to UNIONTYPEs -- this feature is still unusable. If I'm wrong about this I would appreciate someone setting me straight. Add UNIONTYPE serialization support to LazyBinarySerDe -- Key: HIVE-2390 URL: https://issues.apache.org/jira/browse/HIVE-2390 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Jakob Homan Assignee: Suma Shivaprasad Labels: TODOC14, uniontype Fix For: 0.14.0 Attachments: HIVE-2390.1.patch, HIVE-2390.patch When the union type was introduced, full support for it wasn't provided. For instance, when working with a union that gets passed to LazyBinarySerde: {noformat}Caused by: java.lang.RuntimeException: Unrecognized type: UNION at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:468) at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:230) at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:184) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-4629) HS2 should support an API to retrieve query logs
[ https://issues.apache.org/jira/browse/HIVE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106098#comment-14106098 ] Carl Steinbach commented on HIVE-4629: -- bq. I tried posting this on RB but it went down. Thank you very much for removing the thrift enum compatibility problem! I had another comment with regards to the method signature which I think I did not explain well. I think the new method should be... [~brocknoland], I totally agree with this, but I didn't see this in the patch. Are you referring to something in the Thrift IDL file or something else? HS2 should support an API to retrieve query logs Key: HIVE-4629 URL: https://issues.apache.org/jira/browse/HIVE-4629 Project: Hive Issue Type: Sub-task Components: HiveServer2 Reporter: Shreepadma Venugopalan Assignee: Dong Chen Attachments: HIVE-4629-no_thrift.1.patch, HIVE-4629.1.patch, HIVE-4629.2.patch, HIVE-4629.3.patch.txt, HIVE-4629.4.patch, HIVE-4629.5.patch, HIVE-4629.6.patch, HIVE-4629.7.patch HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7457) Minor HCatalog Pig Adapter test clean up
[ https://issues.apache.org/jira/browse/HIVE-7457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-7457: - Status: Open (was: Patch Available) [~davidzchen], the TestHCatLoader test failure needs some investigation. Also, can you please summarize the changes you made to TestOrcHCatPigStorer/TestOrcHCatStorerMulti/TestOrcHCatStorer? I'm not sure RB is comparing the right files in the diff. Thanks. Minor HCatalog Pig Adapter test clean up Key: HIVE-7457 URL: https://issues.apache.org/jira/browse/HIVE-7457 Project: Hive Issue Type: Sub-task Reporter: David Chen Assignee: David Chen Priority: Minor Attachments: HIVE-7457.1.patch, HIVE-7457.2.patch, HIVE-7457.3.patch, HIVE-7457.4.patch Minor cleanup to the HCatalog Pig Adapter tests in preparation for HIVE-7420: * Run through Hive Eclipse formatter. * Convert JUnit 3-style tests to follow JUnit 4 conventions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7420) Parameterize tests for HCatalog Pig interfaces for testing against all storage formats
[ https://issues.apache.org/jira/browse/HIVE-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-7420: - Status: Open (was: Patch Available) Hi [~davidzchen], I left some comments on reviewboard, but also want to mention a couple issues here on JIRA. Please excuse the bullet points: * The automated tests flagged failures in TestHCatLoader, TestHCatLoaderComplexSchema, TestHCatStorer, and TestHCatStorerMulti. We can't commit this patch until these test failures are resolved. * This patch adds support for disabling all of the test methods in a particular testcase on a per-storageformat basis, but I think we really need the ability to disable individual test methods on a per-storageformat basis. It should be possible to do this by maintaining a Map<function_name, List<disabled_test_methods>> and using reflection to find the name of test method. * For parameterized tests that fail, JUnit reports the parameter index, but not the actual parameter values (see report above). The latter would be a lot nicer. There's a [JUnitParams|https://github.com/Pragmatists/junitparams] project on GitHub that doesn't suck (according to the author), and which I think may report the actual parameter values as opposed to the parameter indexes. If you have time please take a look at this and see if it's worth using here. Parameterize tests for HCatalog Pig interfaces for testing against all storage formats -- Key: HIVE-7420 URL: https://issues.apache.org/jira/browse/HIVE-7420 Project: Hive Issue Type: Sub-task Components: HCatalog Reporter: David Chen Assignee: David Chen Attachments: HIVE-7420-without-HIVE-7457.2.patch, HIVE-7420-without-HIVE-7457.3.patch, HIVE-7420.1.patch, HIVE-7420.2.patch, HIVE-7420.3.patch Currently, HCatalog tests only test against RCFile with a few testing against ORC. The tests should be covering other Hive storage formats as well.
HIVE-7286 turns HCatMapReduceTest into a test fixture that can be run with all Hive storage formats and with that patch, all test suites built on HCatMapReduceTest are running and passing against Sequence File, Text, and ORC in addition to RCFile. Similar changes should be made to make the tests for HCatLoader and HCatStorer generic so that they can be run against all Hive storage formats. -- This message was sent by Atlassian JIRA (v6.2#6252)
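The per-storage-format disabling suggested in the review comment above could look roughly like the sketch below. It is a minimal illustration, not code from the HIVE-7420 patch: the class, method, and format names are hypothetical, the map is keyed by storage format name as an assumption, and the caller's method name is read from a stack trace rather than through the reflection API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: names are illustrative and not taken from the patch.
class DisabledTests {

    // Storage format name -> test methods to skip for that format.
    private final Map<String, Set<String>> disabled = new HashMap<>();

    void disable(String storageFormat, String... testMethods) {
        disabled.computeIfAbsent(storageFormat, k -> new HashSet<>())
                .addAll(Arrays.asList(testMethods));
    }

    boolean isDisabled(String storageFormat, String testMethod) {
        return disabled.getOrDefault(storageFormat, Collections.<String>emptySet())
                       .contains(testMethod);
    }

    // Name of the method that called this one, so a test can ask
    // "am I disabled?" without repeating its own name.
    static String callingMethodName() {
        // Element 0 is callingMethodName itself; element 1 is its caller.
        return new Throwable().getStackTrace()[1].getMethodName();
    }
}
```

A test method would call something like isDisabled(format, DisabledTests.callingMethodName()) at its start and return early (or use a JUnit assumption) when the result is true.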
[jira] [Updated] (HIVE-7286) Parameterize HCatMapReduceTest for testing against all Hive storage formats
[ https://issues.apache.org/jira/browse/HIVE-7286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-7286: - Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks David! Parameterize HCatMapReduceTest for testing against all Hive storage formats --- Key: HIVE-7286 URL: https://issues.apache.org/jira/browse/HIVE-7286 Project: Hive Issue Type: Sub-task Components: HCatalog Reporter: David Chen Assignee: David Chen Fix For: 0.14.0 Attachments: HIVE-7286.1.patch, HIVE-7286.10.patch, HIVE-7286.11.patch, HIVE-7286.2.patch, HIVE-7286.3.patch, HIVE-7286.4.patch, HIVE-7286.5.patch, HIVE-7286.6.patch, HIVE-7286.7.patch, HIVE-7286.8.patch, HIVE-7286.9.patch Currently, HCatMapReduceTest is extended by the following test suites: * TestHCatDynamicPartitioned * TestHCatNonPartitioned * TestHCatPartitioned * TestHCatExternalDynamicPartitioned * TestHCatExternalNonPartitioned * TestHCatExternalPartitioned * TestHCatMutableDynamicPartitioned * TestHCatMutableNonPartitioned * TestHCatMutablePartitioned These tests run against RCFile. Currently, only TestHCatDynamicPartitioned is run against any other storage format (ORC). Ideally, HCatalog should be tested against all storage formats supported by Hive. The easiest way to accomplish this is to turn HCatMapReduceTest into a parameterized test fixture that enumerates all Hive storage formats. Until HIVE-5976 is implemented, we would need to manually create the mapping of SerDe to InputFormat and OutputFormat. This way, we can explicitly keep track of which storage formats currently work with HCatalog and which ones are untested or have test failures. The test fixture should also use reflection to find all classes on the classpath that implement the SerDe interface and raise a failure if any of them are not enumerated. -- This message was sent by Atlassian JIRA (v6.2#6252)
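The manual SerDe-to-format mapping and the "fail if a SerDe is not enumerated" check described in this ticket can be sketched as follows. This is an illustration under stated assumptions, not the code committed for HIVE-7286: every class name is a placeholder in a made-up example package, and the classpath scan is assumed to be done elsewhere and passed in as a set of class names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the fixture's data; all class names are placeholders.
class StorageFormats {

    // One row per storage format: SerDe, InputFormat, OutputFormat class names.
    static final class Format {
        final String serde;
        final String inputFormat;
        final String outputFormat;

        Format(String serde, String inputFormat, String outputFormat) {
            this.serde = serde;
            this.inputFormat = inputFormat;
            this.outputFormat = outputFormat;
        }
    }

    // The manually maintained mapping the ticket describes.
    static final List<Format> ENUMERATED = Arrays.asList(
        new Format("example.RcSerDe", "example.RcInputFormat", "example.RcOutputFormat"),
        new Format("example.OrcSerDe", "example.OrcInputFormat", "example.OrcOutputFormat"));

    // Returns SerDe classes that were discovered but are not in the mapping.
    // A fixture would fail the run when this set is non-empty.
    static Set<String> missing(Set<String> discoveredSerdes) {
        Set<String> result = new TreeSet<>(discoveredSerdes);
        for (Format f : ENUMERATED) {
            result.remove(f.serde);
        }
        return result;
    }
}
```

Each entry of ENUMERATED would become one parameter set of the parameterized fixture, so a new SerDe shows up either as a tested format or as an explicit failure.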
[jira] [Updated] (HIVE-6806) CREATE TABLE should support STORED AS AVRO
[ https://issues.apache.org/jira/browse/HIVE-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6806: - Resolution: Fixed Fix Version/s: 0.14.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Ashish! CREATE TABLE should support STORED AS AVRO -- Key: HIVE-6806 URL: https://issues.apache.org/jira/browse/HIVE-6806 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Affects Versions: 0.12.0 Reporter: Jeremy Beard Assignee: Ashish Kumar Singh Priority: Minor Labels: Avro Fix For: 0.14.0 Attachments: HIVE-6806.1.patch, HIVE-6806.2.patch, HIVE-6806.3.patch, HIVE-6806.patch Avro is well established and widely used within Hive, however creating Avro-backed tables requires the messy listing of the SerDe, InputFormat and OutputFormat classes. Similarly to HIVE-5783 for Parquet, Hive would be easier to use if it had native Avro support. -- This message was sent by Atlassian JIRA (v6.2#6252)
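For readers unfamiliar with the "messy listing" the description mentions, the sketch below prints the two storage clauses side by side. The helper class is hypothetical and exists only to hold the strings; the Hive class names are the Avro SerDe and container formats that Hive ships, and the rest of the CREATE TABLE statement (columns, schema properties) is deliberately left out.

```java
// Hypothetical helper for illustration; only the Hive class names and the
// STORED AS AVRO keyword come from Hive itself.
class AvroStorageClause {

    // The clause needed before STORED AS AVRO was supported.
    static String verbose() {
        return "ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'\n"
             + "STORED AS\n"
             + "  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'\n"
             + "  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'";
    }

    // The shorthand this ticket adds.
    static String shorthand() {
        return "STORED AS AVRO";
    }

    public static void main(String[] args) {
        System.out.println("-- before:");
        System.out.println(verbose());
        System.out.println("-- after:");
        System.out.println(shorthand());
    }
}
```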
[jira] [Updated] (HIVE-6806) CREATE TABLE should support STORED AS AVRO
[ https://issues.apache.org/jira/browse/HIVE-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6806: - Summary: CREATE TABLE should support STORED AS AVRO (was: Native Avro support in Hive) CREATE TABLE should support STORED AS AVRO -- Key: HIVE-6806 URL: https://issues.apache.org/jira/browse/HIVE-6806 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Affects Versions: 0.12.0 Reporter: Jeremy Beard Assignee: Ashish Kumar Singh Priority: Minor Labels: Avro Attachments: HIVE-6806.1.patch, HIVE-6806.2.patch, HIVE-6806.patch Avro is well established and widely used within Hive, however creating Avro-backed tables requires the messy listing of the SerDe, InputFormat and OutputFormat classes. Similarly to HIVE-5783 for Parquet, Hive would be easier to use if it had native Avro support. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6806) CREATE TABLE should support STORED AS AVRO
[ https://issues.apache.org/jira/browse/HIVE-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071321#comment-14071321 ] Carl Steinbach commented on HIVE-6806: -- [~tomwhite], [~lars_francke], [~brocknoland]: Are you guys satisfied with the current version of the patch? If so I'll plan to +1 it and get it committed after another round of automated tests. Thanks. CREATE TABLE should support STORED AS AVRO -- Key: HIVE-6806 URL: https://issues.apache.org/jira/browse/HIVE-6806 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Affects Versions: 0.12.0 Reporter: Jeremy Beard Assignee: Ashish Kumar Singh Priority: Minor Labels: Avro Attachments: HIVE-6806.1.patch, HIVE-6806.2.patch, HIVE-6806.patch Avro is well established and widely used within Hive, however creating Avro-backed tables requires the messy listing of the SerDe, InputFormat and OutputFormat classes. Similarly to HIVE-5783 for Parquet, Hive would be easier to use if it had native Avro support. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6806) Native Avro support in Hive
[ https://issues.apache.org/jira/browse/HIVE-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060995#comment-14060995 ] Carl Steinbach commented on HIVE-6806: -- Does anyone object to changing the summary of this ticket to CREATE TABLE should support STORED AS AVRO? The current description can be misinterpreted to mean that this patch is adding the AvroSerDe. Native Avro support in Hive --- Key: HIVE-6806 URL: https://issues.apache.org/jira/browse/HIVE-6806 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Affects Versions: 0.12.0 Reporter: Jeremy Beard Assignee: Ashish Kumar Singh Priority: Minor Labels: Avro Attachments: HIVE-6806.patch Avro is well established and widely used within Hive, however creating Avro-backed tables requires the messy listing of the SerDe, InputFormat and OutputFormat classes. Similarly to HIVE-5783 for Parquet, Hive would be easier to use if it had native Avro support. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5976) Decouple input formats from STORED as keywords
[ https://issues.apache.org/jira/browse/HIVE-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5976: - Status: Open (was: Patch Available) [~davidzchen], I'd like to review this and get it committed. Can you post an RB? Thanks! Decouple input formats from STORED as keywords -- Key: HIVE-5976 URL: https://issues.apache.org/jira/browse/HIVE-5976 Project: Hive Issue Type: Task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5976.2.patch, HIVE-5976.patch, HIVE-5976.patch, HIVE-5976.patch, HIVE-5976.patch As noted in HIVE-5783, we hard code the input formats mapped to keywords. It'd be nice if there was a registration system so we didn't need to do that. -- This message was sent by Atlassian JIRA (v6.2#6252)
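A registration system of the kind the description asks for might be shaped like the sketch below: formats register a keyword together with their classes, and the parser looks the keyword up instead of hard-coding it. This is only one possible shape, with hypothetical class and method names; it is not the design in the HIVE-5976 patches.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a keyword registry; not the HIVE-5976 implementation.
class StorageFormatRegistry {

    // Class names a STORED AS keyword expands to.
    static final class Descriptor {
        final String inputFormat;
        final String outputFormat;
        final String serde;

        Descriptor(String inputFormat, String outputFormat, String serde) {
            this.inputFormat = inputFormat;
            this.outputFormat = outputFormat;
            this.serde = serde;
        }
    }

    private final Map<String, Descriptor> byKeyword = new ConcurrentHashMap<>();

    // Registers a keyword; rejects duplicates so two formats cannot clash.
    void register(String keyword, Descriptor descriptor) {
        Descriptor previous = byKeyword.putIfAbsent(normalize(keyword), descriptor);
        if (previous != null) {
            throw new IllegalArgumentException("Keyword already registered: " + keyword);
        }
    }

    // Returns the descriptor for a keyword, or null if none is registered.
    Descriptor lookup(String keyword) {
        return byKeyword.get(normalize(keyword));
    }

    // Keywords are matched case-insensitively.
    private static String normalize(String keyword) {
        return keyword.trim().toUpperCase(Locale.ROOT);
    }
}
```

With a registry like this, adding a new format becomes a register call made by the format's own module rather than an edit to the parser.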
[jira] [Updated] (HIVE-7094) Separate out static/dynamic partitioning code in FileRecordWriterContainer
[ https://issues.apache.org/jira/browse/HIVE-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-7094: - Resolution: Fixed Fix Version/s: 0.14.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk. Thanks David! Separate out static/dynamic partitioning code in FileRecordWriterContainer -- Key: HIVE-7094 URL: https://issues.apache.org/jira/browse/HIVE-7094 Project: Hive Issue Type: Sub-task Components: HCatalog Reporter: David Chen Assignee: David Chen Fix For: 0.14.0 Attachments: HIVE-7094.1.patch, HIVE-7094.3.patch, HIVE-7094.4.patch, HIVE-7094.5.patch There are two major places in FileRecordWriterContainer that have the {{if (dynamicPartitioning)}} condition: the constructor and write(). This is the approach that I am taking: # Move the DP and SP code into two subclasses: DynamicFileRecordWriterContainer and StaticFileRecordWriterContainer. # Make FileRecordWriterContainer an abstract class that contains the common code for both implementations. For write(), FileRecordWriterContainer will call an abstract method that will provide the local RecordWriter, ObjectInspector, SerDe, and OutputJobInfo. -- This message was sent by Atlassian JIRA (v6.2#6252)
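The refactoring described above is the template-method pattern: the base class keeps the shared write() logic and asks subclasses which underlying writer to use. The sketch below shows that shape with simplified, hypothetical stand-ins. A plain list of strings plays the role of the local RecordWriter, and the ObjectInspector, SerDe, and OutputJobInfo that the real abstract method also supplies are omitted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical stand-ins for the classes named in the ticket.
abstract class WriterContainerSketch {

    // Common code shared by both implementations.
    final void write(String partitionValue, String row) {
        localWriter(partitionValue).add(row);
    }

    // Each subclass decides which underlying writer handles this record.
    abstract List<String> localWriter(String partitionValue);
}

// Static partitioning: a single writer, chosen up front.
class StaticWriterSketch extends WriterContainerSketch {
    final List<String> writer = new ArrayList<>();

    @Override
    List<String> localWriter(String partitionValue) {
        return writer;
    }
}

// Dynamic partitioning: one writer per partition value, created on first use.
class DynamicWriterSketch extends WriterContainerSketch {
    final Map<String, List<String>> writers = new HashMap<>();

    @Override
    List<String> localWriter(String partitionValue) {
        return writers.computeIfAbsent(partitionValue, k -> new ArrayList<>());
    }
}
```

The {{if (dynamicPartitioning)}} branches disappear because the choice is made once, when the concrete subclass is constructed.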
[jira] [Commented] (HIVE-7094) Separate out static/dynamic partitioning code in FileRecordWriterContainer
[ https://issues.apache.org/jira/browse/HIVE-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037893#comment-14037893 ] Carl Steinbach commented on HIVE-7094: -- [~sushanth]: I'm planning to commit this patch tonight. Please let me know if I should hold off. Thanks. Separate out static/dynamic partitioning code in FileRecordWriterContainer -- Key: HIVE-7094 URL: https://issues.apache.org/jira/browse/HIVE-7094 Project: Hive Issue Type: Sub-task Components: HCatalog Reporter: David Chen Assignee: David Chen Attachments: HIVE-7094.1.patch, HIVE-7094.3.patch, HIVE-7094.4.patch, HIVE-7094.5.patch There are two major places in FileRecordWriterContainer that have the {{if (dynamicPartitioning)}} condition: the constructor and write(). This is the approach that I am taking: # Move the DP and SP code into two subclasses: DynamicFileRecordWriterContainer and StaticFileRecordWriterContainer. # Make FileRecordWriterContainer an abstract class that contains the common code for both implementations. For write(), FileRecordWriterContainer will call an abstract method that will provide the local RecordWriter, ObjectInspector, SerDe, and OutputJobInfo. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7094) Separate out static/dynamic partitioning code in FileRecordWriterContainer
[ https://issues.apache.org/jira/browse/HIVE-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-7094: - Status: Open (was: Patch Available) Looks like some tests failed. [~davidzchen], can you please take a look? Thanks. Separate out static/dynamic partitioning code in FileRecordWriterContainer -- Key: HIVE-7094 URL: https://issues.apache.org/jira/browse/HIVE-7094 Project: Hive Issue Type: Sub-task Components: HCatalog Reporter: David Chen Assignee: David Chen Attachments: HIVE-7094.1.patch, HIVE-7094.3.patch, HIVE-7094.4.patch There are two major places in FileRecordWriterContainer that have the {{if (dynamicPartitioning)}} condition: the constructor and write(). This is the approach that I am taking: # Move the DP and SP code into two subclasses: DynamicFileRecordWriterContainer and StaticFileRecordWriterContainer. # Make FileRecordWriterContainer an abstract class that contains the common code for both implementations. For write(), FileRecordWriterContainer will call an abstract method that will provide the local RecordWriter, ObjectInspector, SerDe, and OutputJobInfo. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7094) Separate out static/dynamic partitioning code in FileRecordWriterContainer
[ https://issues.apache.org/jira/browse/HIVE-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031188#comment-14031188 ] Carl Steinbach commented on HIVE-7094: -- [~davidzchen]: +1. Can you please attach a new version of the patch to trigger testing? If everything passes I will commit. Separate out static/dynamic partitioning code in FileRecordWriterContainer -- Key: HIVE-7094 URL: https://issues.apache.org/jira/browse/HIVE-7094 Project: Hive Issue Type: Sub-task Components: HCatalog Reporter: David Chen Assignee: David Chen Attachments: HIVE-7094.1.patch There are two major places in FileRecordWriterContainer that have the {{if (dynamicPartitioning)}} condition: the constructor and write(). This is the approach that I am taking: # Move the DP and SP code into two subclasses: DynamicFileRecordWriterContainer and StaticFileRecordWriterContainer. # Make FileRecordWriterContainer an abstract class that contains the common code for both implementations. For write(), FileRecordWriterContainer will call an abstract method that will provide the local RecordWriter, ObjectInspector, SerDe, and OutputJobInfo. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7094) Separate out static/dynamic partitioning code in FileRecordWriterContainer
[ https://issues.apache.org/jira/browse/HIVE-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-7094: - Status: Open (was: Patch Available) [~davidzchen]: I left some comments on rb. Thanks. Separate out static/dynamic partitioning code in FileRecordWriterContainer -- Key: HIVE-7094 URL: https://issues.apache.org/jira/browse/HIVE-7094 Project: Hive Issue Type: Sub-task Components: HCatalog Reporter: David Chen Assignee: David Chen Attachments: HIVE-7094.1.patch There are two major places in FileRecordWriterContainer that have the {{if (dynamicPartitioning)}} condition: the constructor and write(). This is the approach that I am taking: # Move the DP and SP code into two subclasses: DynamicFileRecordWriterContainer and StaticFileRecordWriterContainer. # Make FileRecordWriterContainer an abstract class that contains the common code for both implementations. For write(), FileRecordWriterContainer will call an abstract method that will provide the local RecordWriter, ObjectInspector, SerDe, and OutputJobInfo. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-4734) Use custom ObjectInspectors for AvroSerde
[ https://issues.apache.org/jira/browse/HIVE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-4734: - Status: Open (was: Patch Available) [~mwagner]: I'd like to get this committed. Would you mind rebasing the patch against trunk? Thanks. Use custom ObjectInspectors for AvroSerde - Key: HIVE-4734 URL: https://issues.apache.org/jira/browse/HIVE-4734 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Mark Wagner Assignee: Mark Wagner Attachments: HIVE-4734.1.patch, HIVE-4734.2.patch, HIVE-4734.3.patch, HIVE-4734.4.patch, HIVE-4734.5.patch Currently, the AvroSerde recursively copies all fields of a record from the GenericRecord to a List row object and provides the standard ObjectInspectors. Performance can be improved by providing ObjectInspectors to the Avro record itself. -- This message was sent by Atlassian JIRA (v6.2#6252)
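To make the difference described in HIVE-4734 concrete, here is a rough sketch that contrasts inspecting a copied row with inspecting the record itself. It is a simplification under stated assumptions: FieldInspectorSketch is a hypothetical interface and not Hive's ObjectInspector API, a Map stands in for Avro's GenericRecord, and the copy shown is shallow whereas the description says the AvroSerde copies recursively.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified inspector interface; Hive's real ObjectInspector
// API is considerably richer.
interface FieldInspectorSketch {
    Object getField(Object row, String fieldName);
}

// Behaviour as described in the issue: copy every field of the record into a
// List row first, then inspect the copy by position.
final class CopiedRowInspectorSketch implements FieldInspectorSketch {
    private final List<String> schema;

    CopiedRowInspectorSketch(List<String> schema) {
        this.schema = schema;
    }

    // Shallow copy for brevity; a Map stands in for the Avro record.
    List<Object> copy(Map<String, Object> record) {
        List<Object> row = new ArrayList<>(schema.size());
        for (String name : schema) {
            row.add(record.get(name));
        }
        return row;
    }

    @Override
    public Object getField(Object row, String fieldName) {
        return ((List<?>) row).get(schema.indexOf(fieldName));
    }
}

// Proposed direction: the inspector reads straight from the record, so no
// intermediate row object is built.
final class RecordInspectorSketch implements FieldInspectorSketch {
    @Override
    public Object getField(Object row, String fieldName) {
        return ((Map<?, ?>) row).get(fieldName);
    }
}
```

Both inspectors return the same field values; the second one simply skips the per-row copy, which is where the performance improvement mentioned in the issue would come from.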
[jira] [Updated] (HIVE-7094) Separate out static/dynamic partitioning code in FileRecordWriterContainer
[ https://issues.apache.org/jira/browse/HIVE-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-7094: - Component/s: HCatalog Separate out static/dynamic partitioning code in FileRecordWriterContainer -- Key: HIVE-7094 URL: https://issues.apache.org/jira/browse/HIVE-7094 Project: Hive Issue Type: Sub-task Components: HCatalog Reporter: David Chen Assignee: David Chen Attachments: HIVE-7094.1.patch There are two major places in FileRecordWriterContainer that have the {{if (dynamicPartitioning)}} condition: the constructor and write(). This is the approach that I am taking: # Move the DP and SP code into two subclasses: DynamicFileRecordWriterContainer and StaticFileRecordWriterContainer. # Make FileRecordWriterContainer an abstract class that contains the common code for both implementations. For write(), FileRecordWriterContainer will call an abstract method that will provide the local RecordWriter, ObjectInspector, SerDe, and OutputJobInfo. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7094) Separate out static/dynamic partitioning code in FileRecordWriterContainer
[ https://issues.apache.org/jira/browse/HIVE-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14016200#comment-14016200 ] Carl Steinbach commented on HIVE-7094: -- [~davidzchen]: Is this ready for review now? Separate out static/dynamic partitioning code in FileRecordWriterContainer -- Key: HIVE-7094 URL: https://issues.apache.org/jira/browse/HIVE-7094 Project: Hive Issue Type: Sub-task Reporter: David Chen Assignee: David Chen Attachments: HIVE-7094.1.patch There are two major places in FileRecordWriterContainer that have the {{if (dynamicPartitioning)}} condition: the constructor and write(). This is the approach that I am taking: # Move the DP and SP code into two subclasses: DynamicFileRecordWriterContainer and StaticFileRecordWriterContainer. # Make FileRecordWriterContainer an abstract class that contains the common code for both implementations. For write(), FileRecordWriterContainer will call an abstract method that will provide the local RecordWriter, ObjectInspector, SerDe, and OutputJobInfo. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7110) TestHCatPartitionPublish test failure: No FileSystem or scheme: pfile
[ https://issues.apache.org/jira/browse/HIVE-7110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-7110: - Status: Open (was: Patch Available) [~davidzchen]: Thanks for looking into this. Before we fix this problem I think it's important to determine whether or not the test ever worked, and if so, when and why it stopped working. I'm particularly curious to know if these properties were previously getting set by Maven in a manner that made them accessible to all HCatalog tests. Can you please look into this? Thanks. TestHCatPartitionPublish test failure: No FileSystem or scheme: pfile - Key: HIVE-7110 URL: https://issues.apache.org/jira/browse/HIVE-7110 Project: Hive Issue Type: Bug Components: HCatalog Reporter: David Chen Assignee: David Chen Attachments: HIVE-7110.1.patch, HIVE-7110.2.patch, HIVE-7110.3.patch, HIVE-7110.4.patch I got the following TestHCatPartitionPublish test failure when running all unit tests against Hadoop 1. This also appears when testing against Hadoop 2. {code} Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 26.06 sec FAILURE! - in org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish testPartitionPublish(org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish) Time elapsed: 1.361 sec ERROR! org.apache.hive.hcatalog.common.HCatException: org.apache.hive.hcatalog.common.HCatException : 2001 : Error setting output information. 
Cause : java.io.IOException: No FileSystem for scheme: pfile at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1443) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187) at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:212) at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:70) at org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish.runMRCreateFail(TestHCatPartitionPublish.java:191) at org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish.testPartitionPublish(TestHCatPartitionPublish.java:155) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-4629) HS2 should support an API to retrieve query logs
[ https://issues.apache.org/jira/browse/HIVE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014172#comment-14014172 ] Carl Steinbach commented on HIVE-4629: -- Does anyone think this is the right way to implement this feature? HS2 should support an API to retrieve query logs Key: HIVE-4629 URL: https://issues.apache.org/jira/browse/HIVE-4629 Project: Hive Issue Type: Sub-task Components: HiveServer2 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4629-no_thrift.1.patch, HIVE-4629.1.patch, HIVE-4629.2.patch, HIVE-4629.3.patch.txt HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7094) Separate out static/dynamic partitioning code in FileRecordWriterContainer
[ https://issues.apache.org/jira/browse/HIVE-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013214#comment-14013214 ] Carl Steinbach commented on HIVE-7094: -- [~davidzchen]: Is this patch ready for review? Separate out static/dynamic partitioning code in FileRecordWriterContainer -- Key: HIVE-7094 URL: https://issues.apache.org/jira/browse/HIVE-7094 Project: Hive Issue Type: Sub-task Reporter: David Chen Assignee: David Chen Attachments: HIVE-7094.1.patch There are two major places in FileRecordWriterContainer that have the {{if (dynamicPartitioning)}} condition: the constructor and write(). This is the approach that I am taking: # Move the DP and SP code into two subclasses: DynamicFileRecordWriterContainer and StaticFileRecordWriterContainer. # Make FileRecordWriterContainer an abstract class that contains the common code for both implementations. For write(), FileRecordWriterContainer will call an abstract method that will provide the local RecordWriter, ObjectInspector, SerDe, and OutputJobInfo. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5857) Reduce tasks do not work in uber mode in YARN
[ https://issues.apache.org/jira/browse/HIVE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5857: - Status: Open (was: Patch Available) [~kawaa]: Thanks for providing a fix. In order to get committed, this patch needs a test case. Can you please add a qfile test that enables ubertask mode before running an aggregation query? Thanks. Reduce tasks do not work in uber mode in YARN - Key: HIVE-5857 URL: https://issues.apache.org/jira/browse/HIVE-5857 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Adam Kawa Priority: Critical Labels: plan, uber-jar, uberization, yarn Attachments: HIVE-5857.1.patch.txt A Hive query fails when it tries to run a reduce task in uber mode in YARN. A NullPointerException is thrown in the ExecReducer.configure method because the plan file (reduce.xml) for the reduce task is not found. The Utilities.getBaseWork method is expected to return a BaseWork object, but it returns null because of a FileNotFoundException. {code} // org.apache.hadoop.hive.ql.exec.Utilities public static BaseWork getBaseWork(Configuration conf, String name) { ... try { ... if (gWork == null) { Path localPath; if (ShimLoader.getHadoopShims().isLocalMode(conf)) { localPath = path; } else { localPath = new Path(name); } InputStream in = new FileInputStream(localPath.toUri().getPath()); BaseWork ret = deserializePlan(in); } return gWork; } catch (FileNotFoundException fnf) { // happens. e.g.: no reduce work. LOG.debug("No plan file found: " + path); return null; } ... } {code} This happens because the ShimLoader.getHadoopShims().isLocalMode(conf) method returns true: immediately before running a reduce task, org.apache.hadoop.mapred.LocalContainerLauncher changes its configuration to local mode (mapreduce.framework.name is changed from yarn to local). Map tasks, on the other hand, run successfully because their configuration is not changed and remains yarn.
{code} // org.apache.hadoop.mapred.LocalContainerLauncher private void runSubtask(..) { ... conf.set(MRConfig.FRAMEWORK_NAME, MRConfig.LOCAL_FRAMEWORK_NAME); conf.set(MRConfig.MASTER_ADDRESS, "local"); // bypass shuffle ReduceTask reduce = (ReduceTask)task; reduce.setConf(conf); reduce.run(conf, umbilical); } {code} A super quick fix could be just an additional if-branch that checks whether we are running a reduce task in uber mode and, if so, looks for the plan file in a different location. *Java stacktrace* {code} 2013-11-20 00:50:56,862 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.ql.exec.Utilities: No plan file found: hdfs://namenode.c.lon.spotify.net:54310/var/tmp/kawaa/hive_2013-11-20_00-50-43_888_3938384086824086680-2/-mr-10003/e3caacf6-15d6-4987-b186-d2906791b5b0/reduce.xml 2013-11-20 00:50:56,862 WARN [uber-SubtaskRunner] org.apache.hadoop.mapred.LocalContainerLauncher: Exception running local (uberized) 'child' : java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:427) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408) at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.runSubtask(LocalContainerLauncher.java:340) at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.run(LocalContainerLauncher.java:225) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) ... 7 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:116) ... 12 more 2013-11-20 00:50:56,862 INFO [uber-SubtaskRunner] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1384392632998_34791_r_00_0 2013-11-20 00:50:56,862 INFO [uber-SubtaskRunner] org.apache.hadoop.mapred.TaskAttemptListenerImpl:
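The "additional if-branch" idea from the HIVE-5857 description might look roughly like the sketch below. It is not the attached patch. java.util.Properties stands in for Hadoop's Configuration, the "mapreduce.task.uberized" marker key is an assumption made for illustration, and the choice to fall back to the plain plan name in uber mode is likewise an assumption about what "a different location" means. Only mapreduce.framework.name and its yarn/local values come from the description.

```java
import java.util.Properties;

// Sketch of plan-path resolution with an extra uber-mode branch.
final class PlanPathSketch {
    // From the issue description: switched from "yarn" to "local" by
    // LocalContainerLauncher right before the reduce task runs.
    static final String FRAMEWORK_KEY = "mapreduce.framework.name";

    // Assumed marker for an uberized task; hypothetical in this sketch.
    static final String UBERIZED_KEY = "mapreduce.task.uberized";

    // Simplified stand-in for ShimLoader.getHadoopShims().isLocalMode(conf).
    static boolean isLocalMode(Properties conf) {
        return "local".equals(conf.getProperty(FRAMEWORK_KEY));
    }

    // Returns the path the plan file would be read from.
    static String resolvePlanPath(Properties conf, String fullPath, String name) {
        boolean uberized = Boolean.parseBoolean(conf.getProperty(UBERIZED_KEY, "false"));
        if (uberized) {
            // Uber mode only looks local; treat it like a cluster task
            // (assumption: the plan is available under its plain name).
            return name;
        }
        if (isLocalMode(conf)) {
            return fullPath;
        }
        return name;
    }
}
```

The point of the extra branch is that the local-mode check alone cannot tell a genuinely local job from an uberized reduce task whose configuration was rewritten just before it started.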
[jira] [Updated] (HIVE-5857) Reduce tasks do not work in uber mode in YARN
[ https://issues.apache.org/jira/browse/HIVE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5857: - Assignee: Adam Kawa Reduce tasks do not work in uber mode in YARN - Key: HIVE-5857 URL: https://issues.apache.org/jira/browse/HIVE-5857 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0, 0.13.0, 0.13.1 Reporter: Adam Kawa Assignee: Adam Kawa Priority: Critical Labels: plan, uber-jar, uberization, yarn Attachments: HIVE-5857.1.patch.txt
[jira] [Updated] (HIVE-5857) Reduce tasks do not work in uber mode in YARN
[ https://issues.apache.org/jira/browse/HIVE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5857: - Affects Version/s: 0.13.1 0.13.0 Reduce tasks do not work in uber mode in YARN - Key: HIVE-5857 URL: https://issues.apache.org/jira/browse/HIVE-5857 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0, 0.13.0, 0.13.1 Reporter: Adam Kawa Assignee: Adam Kawa Priority: Critical Labels: plan, uber-jar, uberization, yarn Attachments: HIVE-5857.1.patch.txt
[jira] [Updated] (HIVE-7104) Unit tests are disabled
[ https://issues.apache.org/jira/browse/HIVE-7104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-7104: - Resolution: Fixed Fix Version/s: 0.14.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk. Thanks David! Unit tests are disabled --- Key: HIVE-7104 URL: https://issues.apache.org/jira/browse/HIVE-7104 Project: Hive Issue Type: Bug Reporter: David Chen Assignee: David Chen Fix For: 0.14.0 Attachments: HIVE-7104.1.patch When I run {{mvn clean test -Phadoop-1|2}}, none of the unit tests are run. I did a binary search through the commit logs and found that the change that caused the unit tests to be disabled was the change to the root pom.xml in the patch for HIVE-7067 (e77f38dc44de5a9b10bce8e0a2f1f5452f6921ed). Removing that change allowed the unit tests to be run again. -- This message was sent by Atlassian JIRA (v6.2#6252)
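The binary search through the commit log mentioned in HIVE-7104 can be sketched as below. This is a generic illustration, not the procedure David actually ran: the class name, the commit labels, and the predicate are hypothetical, and the sketch assumes the list is ordered oldest first, the first commit still runs the tests, the last one does not, and the breakage persists once introduced.

```java
import java.util.List;
import java.util.function.Predicate;

// Bisects an ordered commit list to find the first commit where the unit
// tests stopped running.
final class CommitBisectSketch {
    // commits: oldest first; commits.get(0) is known good, the last is known bad.
    // testsRun: returns true when the unit tests still run at that commit.
    static int firstBadCommit(List<String> commits, Predicate<String> testsRun) {
        int good = 0;
        int bad = commits.size() - 1;
        // Invariant: commits.get(good) is good and commits.get(bad) is bad.
        while (bad - good > 1) {
            int mid = good + (bad - good) / 2;
            if (testsRun.test(commits.get(mid))) {
                good = mid;
            } else {
                bad = mid;
            }
        }
        return bad;
    }
}
```

Each probe halves the remaining range, so even a long history needs only a logarithmic number of builds to isolate the offending change.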
[jira] [Updated] (HIVE-7066) hive-exec jar is missing avro core
[ https://issues.apache.org/jira/browse/HIVE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-7066: - Component/s: Build Infrastructure hive-exec jar is missing avro core -- Key: HIVE-7066 URL: https://issues.apache.org/jira/browse/HIVE-7066 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.13.0, 0.13.1 Reporter: David Chen Assignee: David Chen Attachments: HIVE-7066.1.patch Running a simple query that reads an Avro table caused the following exception to be thrown on the cluster side: {code} java.lang.RuntimeException: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat Serialization trace: outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc) aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork) at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:365) at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:276) at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:254) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:445) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:438) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587) at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:191) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat Serialization trace: outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc) aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork) at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125) at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776) at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139) at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694) at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106) at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672) at org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:942) at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:850) at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:864) at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:334) ... 
13 more Caused by: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat at org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:45) at org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:26) at org.apache.hive.com.esotericsoftware.kryo.Kryo.newDefaultSerializer(Kryo.java:343) at org.apache.hive.com.esotericsoftware.kryo.Kryo.getDefaultSerializer(Kryo.java:336) at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.registerImplicit(DefaultClassResolver.java:56) at
[jira] [Updated] (HIVE-7066) hive-exec jar is missing avro core
[ https://issues.apache.org/jira/browse/HIVE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-7066: - Affects Version/s: 0.13.1 0.13.0 hive-exec jar is missing avro core -- Key: HIVE-7066 URL: https://issues.apache.org/jira/browse/HIVE-7066 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.13.0, 0.13.1 Reporter: David Chen Assignee: David Chen Attachments: HIVE-7066.1.patch
[jira] [Updated] (HIVE-7066) hive-exec jar is missing avro core
[ https://issues.apache.org/jira/browse/HIVE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-7066: - Resolution: Fixed Fix Version/s: 0.14.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk. Thanks David! hive-exec jar is missing avro core -- Key: HIVE-7066 URL: https://issues.apache.org/jira/browse/HIVE-7066 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.13.0, 0.13.1 Reporter: David Chen Assignee: David Chen Fix For: 0.14.0 Attachments: HIVE-7066.1.patch
[jira] [Commented] (HIVE-3159) Update AvroSerde to determine schema of new tables
[ https://issues.apache.org/jira/browse/HIVE-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998544#comment-13998544 ] Carl Steinbach commented on HIVE-3159: -- bq. Recently committed HIVE-5823, added some bug. [~kamrul]: HIVE-5823 was resolved as WONTFIX. Update AvroSerde to determine schema of new tables -- Key: HIVE-3159 URL: https://issues.apache.org/jira/browse/HIVE-3159 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.12.0 Reporter: Jakob Homan Assignee: Mohammad Kamrul Islam Attachments: HIVE-3159.10.patch, HIVE-3159.4.patch, HIVE-3159.5.patch, HIVE-3159.6.patch, HIVE-3159.7.patch, HIVE-3159.9.patch, HIVE-3159v1.patch Currently when writing tables to Avro one must manually provide an Avro schema that matches what is being delivered by Hive. It'd be better to have the serde infer this schema by converting the table's TypeInfo into an appropriate AvroSchema. -- This message was sent by Atlassian JIRA (v6.2#6252)
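To make the HIVE-3159 idea above concrete, here is a minimal sketch of deriving an Avro record schema from Hive column types. It is not the code from any of the attached patches: the class name, the method names, and the decision to wrap every column in a union with null are assumptions made for illustration, and only a handful of primitive types are covered.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class HiveToAvroSketch {
    // Hive primitive type name -> Avro primitive type name (subset only).
    private static final Map<String, String> PRIMITIVES = new HashMap<>();
    static {
        PRIMITIVES.put("int", "int");
        PRIMITIVES.put("bigint", "long");
        PRIMITIVES.put("float", "float");
        PRIMITIVES.put("double", "double");
        PRIMITIVES.put("boolean", "boolean");
        PRIMITIVES.put("string", "string");
        PRIMITIVES.put("binary", "bytes");
    }

    /** Maps one Hive type name to an Avro type, as a union with null (an assumption of this sketch). */
    public static String avroType(String hiveType) {
        String avro = PRIMITIVES.get(hiveType);
        if (avro == null) {
            throw new IllegalArgumentException("not handled in this sketch: " + hiveType);
        }
        return "[\"null\",\"" + avro + "\"]";
    }

    /** Builds an Avro record schema (as JSON text) from ordered column name -> Hive type pairs. */
    public static String recordSchema(String recordName, LinkedHashMap<String, String> columns) {
        StringJoiner fields = new StringJoiner(",");
        for (Map.Entry<String, String> column : columns.entrySet()) {
            fields.add("{\"name\":\"" + column.getKey() + "\",\"type\":"
                    + avroType(column.getValue()) + ",\"default\":null}");
        }
        return "{\"type\":\"record\",\"name\":\"" + recordName + "\",\"fields\":[" + fields + "]}";
    }

    public static void main(String[] args) {
        LinkedHashMap<String, String> columns = new LinkedHashMap<>();
        columns.put("id", "bigint");
        columns.put("name", "string");
        System.out.println(recordSchema("example", columns));
    }
}
```

A real implementation would have to walk Hive's TypeInfo tree and handle complex types (arrays, maps, structs, unions), which this sketch deliberately leaves out.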
[jira] [Commented] (HIVE-6974) Make Metastore Version Check work with Custom version suffixes
[ https://issues.apache.org/jira/browse/HIVE-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13995705#comment-13995705 ] Carl Steinbach commented on HIVE-6974: -- Looks like this was already fixed in HIVE-5484. Make Metastore Version Check work with Custom version suffixes -- Key: HIVE-6974 URL: https://issues.apache.org/jira/browse/HIVE-6974 Project: Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach HIVE-3764 added support for doing a version consistency check between the Hive JARs on the classpath and the metastore schema in the backend database. This is a nice feature, but it currently doesn't work well for folks who append their own suffixes to the release version, e.g. 0.12.0.li_20. We can fix this problem by modifying MetaStoreSchemaInfo.getHiveSchemaVersion() to match against ^\d+\.\d+\.\d+ and ignore anything that remains. -- This message was sent by Atlassian JIRA (v6.2#6252)
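The matching rule proposed in the HIVE-6974 description can be sketched in a few lines of Java. This is only an illustration of the regex, not the actual MetaStoreSchemaInfo code or the HIVE-5484 fix; the class and method names are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VersionPrefix {
    // Leading major.minor.patch, using the pattern quoted in the issue description.
    private static final Pattern BASE_VERSION = Pattern.compile("^\\d+\\.\\d+\\.\\d+");

    /** Returns the leading x.y.z part of a version string, or null if there is none. */
    public static String baseVersion(String fullVersion) {
        Matcher matcher = BASE_VERSION.matcher(fullVersion);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(baseVersion("0.12.0.li_20")); // prints 0.12.0
        System.out.println(baseVersion("0.12.0"));       // prints 0.12.0
    }
}
```

Because anything after the third numeric component is dropped, a vendor build such as 0.12.0.li_20 would compare equal to a stock 0.12.0 schema version.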
[jira] [Updated] (HIVE-3159) Update AvroSerde to determine schema of new tables
[ https://issues.apache.org/jira/browse/HIVE-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-3159: - Status: Open (was: Patch Available) Looks like a couple tests failed. Update AvroSerde to determine schema of new tables -- Key: HIVE-3159 URL: https://issues.apache.org/jira/browse/HIVE-3159 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.12.0 Reporter: Jakob Homan Assignee: Mohammad Kamrul Islam Attachments: HIVE-3159.10.patch, HIVE-3159.4.patch, HIVE-3159.5.patch, HIVE-3159.6.patch, HIVE-3159.7.patch, HIVE-3159.9.patch, HIVE-3159v1.patch Currently when writing tables to Avro one must manually provide an Avro schema that matches what is being delivered by Hive. It'd be better to have the serde infer this schema by converting the table's TypeInfo into an appropriate AvroSchema. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6187) Add test to verify that DESCRIBE TABLE works with quoted table names
[ https://issues.apache.org/jira/browse/HIVE-6187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6187: - Status: Patch Available (was: Open) Add test to verify that DESCRIBE TABLE works with quoted table names Key: HIVE-6187 URL: https://issues.apache.org/jira/browse/HIVE-6187 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Andy Mok Attachments: HIVE-6187.1.patch Backticks around tables named after special keywords, such as items, allow us to create, drop, and alter the table. For example {code:sql} CREATE TABLE foo.`items` (bar INT); DROP TABLE foo.`items`; ALTER TABLE `items` RENAME TO `items_`; {code} However, we cannot call {code:sql} DESCRIBE foo.`items`; DESCRIBE `items`; {code} The DESCRIBE query does not permit backticks to surround table names. The error returned is {code:sql} FAILED: SemanticException [Error 10001]: Table not found `items` {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6187) Cannot use backticks around table name when using DESCRIBE query
[ https://issues.apache.org/jira/browse/HIVE-6187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994965#comment-13994965 ] Carl Steinbach commented on HIVE-6187: -- I can confirm that this functionality is currently working on trunk, and also that it's broken in the 0.12.0 release. I'm not sure when it was fixed, and there doesn't appear to be any test coverage that will prevent someone from breaking it again in the future. Cannot use backticks around table name when using DESCRIBE query Key: HIVE-6187 URL: https://issues.apache.org/jira/browse/HIVE-6187 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Andy Mok Backticks around tables named after special keywords, such as items, allow us to create, drop, and alter the table. For example {code:sql} CREATE TABLE foo.`items` (bar INT); DROP TABLE foo.`items`; ALTER TABLE `items` RENAME TO `items_`; {code} However, we cannot call {code:sql} DESCRIBE foo.`items`; DESCRIBE `items`; {code} The DESCRIBE query does not permit backticks to surround table names. The error returned is {code:sql} FAILED: SemanticException [Error 10001]: Table not found `items` {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6187) Cannot use backticks around table name when using DESCRIBE query
[ https://issues.apache.org/jira/browse/HIVE-6187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6187: - Attachment: HIVE-6187.1.patch Attaching a patch that adds several quoted testcases to describe_table.q. Cannot use backticks around table name when using DESCRIBE query Key: HIVE-6187 URL: https://issues.apache.org/jira/browse/HIVE-6187 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Andy Mok Attachments: HIVE-6187.1.patch Backticks around tables named after special keywords, such as items, allow us to create, drop, and alter the table. For example {code:sql} CREATE TABLE foo.`items` (bar INT); DROP TABLE foo.`items`; ALTER TABLE `items` RENAME TO `items_`; {code} However, we cannot call {code:sql} DESCRIBE foo.`items`; DESCRIBE `items`; {code} The DESCRIBE query does not permit backticks to surround table names. The error returned is {code:sql} FAILED: SemanticException [Error 10001]: Table not found `items` {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6187) Add test to verify that DESCRIBE TABLE works with quoted table names
[ https://issues.apache.org/jira/browse/HIVE-6187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6187: - Summary: Add test to verify that DESCRIBE TABLE works with quoted table names (was: Cannot use backticks around table name when using DESCRIBE query) Add test to verify that DESCRIBE TABLE works with quoted table names Key: HIVE-6187 URL: https://issues.apache.org/jira/browse/HIVE-6187 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Andy Mok Attachments: HIVE-6187.1.patch Backticks around tables named after special keywords, such as items, allow us to create, drop, and alter the table. For example {code:sql} CREATE TABLE foo.`items` (bar INT); DROP TABLE foo.`items`; ALTER TABLE `items` RENAME TO `items_`; {code} However, we cannot call {code:sql} DESCRIBE foo.`items`; DESCRIBE `items`; {code} The DESCRIBE query does not permit backticks to surround table names. The error returned is {code:sql} FAILED: SemanticException [Error 10001]: Table not found `items` {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HIVE-6974) Make Metastore Version Check work with Custom version suffixes
[ https://issues.apache.org/jira/browse/HIVE-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-6974. -- Resolution: Duplicate Make Metastore Version Check work with Custom version suffixes -- Key: HIVE-6974 URL: https://issues.apache.org/jira/browse/HIVE-6974 Project: Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach HIVE-3764 added support for doing a version consistency check between the Hive JARs on the classpath and the metastore schema in the backend database. This is a nice feature, but it currently doesn't work well for folks who append their own suffixes to the release version, e.g. 0.12.0.li_20. We can fix this problem by modifying MetaStoreSchemaInfo.getHiveSchemaVersion() to match against ^\d+\.\d+\.\d+ and ignore anything that remains. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6974) Make Metastore Version Check work with Custom version suffixes
Carl Steinbach created HIVE-6974: Summary: Make Metastore Version Check work with Custom version suffixes Key: HIVE-6974 URL: https://issues.apache.org/jira/browse/HIVE-6974 Project: Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6974) Make Metastore Version Check work with Custom version suffixes
[ https://issues.apache.org/jira/browse/HIVE-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6974: - Description: HIVE-3764 added support for doing a version consistency check between the Hive JARs on the classpath and the metastore schema in the backend database. This is a nice feature, but it currently doesn't work well for folks who append their own suffixes to the release version, e.g. 0.12.0.li_20. We can fix this problem by modifying MetaStoreSchemaInfo.getHiveSchemaVersion() to match against ^\d+\.\d+\.\d+ and ignore anything that remains. Make Metastore Version Check work with Custom version suffixes -- Key: HIVE-6974 URL: https://issues.apache.org/jira/browse/HIVE-6974 Project: Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach HIVE-3764 added support for doing a version consistency check between the Hive JARs on the classpath and the metastore schema in the backend database. This is a nice feature, but it currently doesn't work well for folks who append their own suffixes to the release version, e.g. 0.12.0.li_20. We can fix this problem by modifying MetaStoreSchemaInfo.getHiveSchemaVersion() to match against ^\d+\.\d+\.\d+ and ignore anything that remains. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-3159) Update AvroSerde to determine schema of new tables
[ https://issues.apache.org/jira/browse/HIVE-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980529#comment-13980529 ] Carl Steinbach commented on HIVE-3159: -- [~kamrul]: When we last chatted about this a couple months back you said you were planning to update the patch with some fixes. Are you still planning to do that? Thanks. Update AvroSerde to determine schema of new tables -- Key: HIVE-3159 URL: https://issues.apache.org/jira/browse/HIVE-3159 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Jakob Homan Assignee: Mohammad Kamrul Islam Attachments: HIVE-3159.4.patch, HIVE-3159.5.patch, HIVE-3159.6.patch, HIVE-3159.7.patch, HIVE-3159v1.patch Currently when writing tables to Avro one must manually provide an Avro schema that matches what is being delivered by Hive. It'd be better to have the serde infer this schema by converting the table's TypeInfo into an appropriate AvroSchema. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13972241#comment-13972241 ] Carl Steinbach commented on HIVE-6835: -- [~ashutoshc]: Thanks for catching the Thrift codegen problem. [~erwaman]: Updated patch looks good. +1 Reading of partitioned Avro data fails if partition schema does not match table schema -- Key: HIVE-6835 URL: https://issues.apache.org/jira/browse/HIVE-6835 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch To reproduce: {code} create table testarray (a array<string>); load data local inpath '/home/ahsu/test/array.txt' into table testarray; # create partitioned Avro table with one array column create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }') STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'; insert into table avroarray partition(y=1) select * from testarray; # add an int column with a default value of 0 alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"} } ] }'); # fails with ClassCastException select * from avroarray; {code} The select * fails with: {code} Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13969281#comment-13969281 ] Carl Steinbach edited comment on HIVE-6835 at 4/15/14 7:00 AM: --- +1. Will wait for tests to pass before committing. was (Author: cwsteinbach): +1 Reading of partitioned Avro data fails if partition schema does not match table schema -- Key: HIVE-6835 URL: https://issues.apache.org/jira/browse/HIVE-6835 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch To reproduce: {code} create table testarray (a array<string>); load data local inpath '/home/ahsu/test/array.txt' into table testarray; # create partitioned Avro table with one array column create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }') STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'; insert into table avroarray partition(y=1) select * from testarray; # add an int column with a default value of 0 alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"} } ] }'); # fails with ClassCastException select * from avroarray; {code} The select * fails with: {code} Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13969281#comment-13969281 ] Carl Steinbach commented on HIVE-6835: -- +1 Reading of partitioned Avro data fails if partition schema does not match table schema -- Key: HIVE-6835 URL: https://issues.apache.org/jira/browse/HIVE-6835 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch To reproduce: {code} create table testarray (a array<string>); load data local inpath '/home/ahsu/test/array.txt' into table testarray; # create partitioned Avro table with one array column create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }') STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'; insert into table avroarray partition(y=1) select * from testarray; # add an int column with a default value of 0 alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"} } ] }'); # fails with ClassCastException select * from avroarray; {code} The select * fails with: {code} Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6835: - Status: Open (was: Patch Available) [~erwaman]: Please see my comments on reviewboard. Thanks. Reading of partitioned Avro data fails if partition schema does not match table schema -- Key: HIVE-6835 URL: https://issues.apache.org/jira/browse/HIVE-6835 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6835.1.patch To reproduce: {code} create table testarray (a array<string>); load data local inpath '/home/ahsu/test/array.txt' into table testarray; # create partitioned Avro table with one array column create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }') STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'; insert into table avroarray partition(y=1) select * from testarray; # add an int column with a default value of 0 alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"} } ] }'); # fails with ClassCastException select * from avroarray; {code} The select * fails with: {code} Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-4329) HCatalog clients can't write to AvroSerde backed tables
[ https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-4329: - Assignee: David Chen HCatalog clients can't write to AvroSerde backed tables --- Key: HIVE-4329 URL: https://issues.apache.org/jira/browse/HIVE-4329 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.10.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sean Busbey Assignee: David Chen Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. 
It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.2#6252)
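The key-type mismatch described in HIVE-4329 can be reproduced in miniature without Hadoop on the classpath. In the sketch below, Key, LongKey and NullKey are hypothetical stand-ins for WritableComparable, LongWritable and NullWritable, and the two writer classes are invented for illustration; none of this is Hive or HCatalog code. It shows why a writer whose signature pins the key to one concrete type fails with a ClassCastException when a container hands it a different key through a raw type, while a writer that accepts the general key type does not.

```java
import java.util.ArrayList;
import java.util.List;

public class KeyTypeSketch {
    // Hypothetical stand-ins for WritableComparable, LongWritable and NullWritable.
    public interface Key {}
    public static final class LongKey implements Key {}
    public static final class NullKey implements Key {}

    public interface Writer<K extends Key> {
        void write(K key, String value);
    }

    // Key type pinned to LongKey even though the key is never used.
    public static final class StrictWriter implements Writer<LongKey> {
        public final List<String> rows = new ArrayList<>();
        @Override
        public void write(LongKey key, String value) {
            rows.add(value);
        }
    }

    // The relaxation suggested in the issue: accept any key, since it is ignored anyway.
    public static final class LenientWriter implements Writer<Key> {
        public final List<String> rows = new ArrayList<>();
        @Override
        public void write(Key key, String value) {
            rows.add(value);
        }
    }

    // A container that always passes a NullKey through a raw reference,
    // much as generic framework code tends to. Returns false on ClassCastException.
    @SuppressWarnings({"rawtypes", "unchecked"})
    public static boolean writeWithNullKey(Writer writer, String value) {
        try {
            writer.write(new NullKey(), value);
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(writeWithNullKey(new StrictWriter(), "row"));  // prints false
        System.out.println(writeWithNullKey(new LenientWriter(), "row")); // prints true
    }
}
```

As the issue text notes, relaxing the key type only removes the cast failure; the record writer behind it still has to be implemented meaningfully for writes to succeed.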
[jira] [Commented] (HIVE-4629) HS2 should support an API to retrieve query logs
[ https://issues.apache.org/jira/browse/HIVE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13925491#comment-13925491 ] Carl Steinbach commented on HIVE-4629: -- Does the new version of the patch address any of the API design issues I mentioned earlier? HS2 should support an API to retrieve query logs Key: HIVE-4629 URL: https://issues.apache.org/jira/browse/HIVE-4629 Project: Hive Issue Type: Sub-task Components: HiveServer2 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4629-no_thrift.1.patch, HIVE-4629.1.patch, HIVE-4629.2.patch HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6481) Add .reviewboardrc file
[ https://issues.apache.org/jira/browse/HIVE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13925358#comment-13925358 ] Carl Steinbach commented on HIVE-6481: -- I updated the wiki with some instructions for using rbt. Add .reviewboardrc file --- Key: HIVE-6481 URL: https://issues.apache.org/jira/browse/HIVE-6481 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.13.0 Attachments: HIVE-6481.1.patch, HIVE-6481.2.patch We should add a .reviewboardrc file to trunk in order to streamline the review process. Used in conjunction with RBTools this file makes posting a review request as simple as executing the following command: % rbt post -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6570) Hive variable substitution does not work with the source command
[ https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6570: - Assignee: Anthony Hsu Hive variable substitution does not work with the source command -- Key: HIVE-6570 URL: https://issues.apache.org/jira/browse/HIVE-6570 Project: Hive Issue Type: Bug Reporter: Anthony Hsu Assignee: Anthony Hsu The following does not work: {code} source ${hivevar:test-dir}/test.q; {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6482) Fix NOTICE file: pre release task
[ https://issues.apache.org/jira/browse/HIVE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13917696#comment-13917696 ] Carl Steinbach commented on HIVE-6482: -- [~rhbutani] Looks good to me. Fix NOTICE file: pre release task - Key: HIVE-6482 URL: https://issues.apache.org/jira/browse/HIVE-6482 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani Priority: Trivial Attachments: HIVE-6482.1.patch, HIVE-6482.2.patch As per steps in Release doc: https://cwiki.apache.org/confluence/display/Hive/HowToRelease Removed projects with Apache license as per [~thejas] suggestion. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6481) Add .reviewboardrc file
[ https://issues.apache.org/jira/browse/HIVE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6481: - Resolution: Fixed Fix Version/s: 0.13.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk. Add .reviewboardrc file --- Key: HIVE-6481 URL: https://issues.apache.org/jira/browse/HIVE-6481 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.13.0 Attachments: HIVE-6481.1.patch, HIVE-6481.2.patch We should add a .reviewboardrc file to trunk in order to streamline the review process. Used in conjunction with RBTools this file makes posting a review request as simple as executing the following command: % rbt post -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6024) Load data local inpath unnecessarily creates a copy task
[ https://issues.apache.org/jira/browse/HIVE-6024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6024: - Status: Open (was: Patch Available) I left a comment on RB. Thanks. Load data local inpath unnecessarily creates a copy task Key: HIVE-6024 URL: https://issues.apache.org/jira/browse/HIVE-6024 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Ashutosh Chauhan Assignee: Mohammad Kamrul Islam Attachments: HIVE-6024.1.patch, HIVE-6024.2.patch, HIVE-6024.3.patch The load data command creates an additional copy task only when it's loading from {{local}}. It doesn't create this additional copy task when loading from DFS, though. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6024) Load data local inpath unnecessarily creates a copy task
[ https://issues.apache.org/jira/browse/HIVE-6024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915991#comment-13915991 ] Carl Steinbach commented on HIVE-6024: -- [~ashutoshc] The qfile test that is added in this patch doesn't seem to demonstrate anything. I think it should be removed. What do you think? Load data local inpath unnecessarily creates a copy task Key: HIVE-6024 URL: https://issues.apache.org/jira/browse/HIVE-6024 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Ashutosh Chauhan Assignee: Mohammad Kamrul Islam Attachments: HIVE-6024.1.patch, HIVE-6024.2.patch, HIVE-6024.3.patch The load data command creates an additional copy task only when it's loading from {{local}}. It doesn't create this additional copy task when loading from DFS, though. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6481) Add .reviewboardrc file
[ https://issues.apache.org/jira/browse/HIVE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6481: - Attachment: HIVE-6481.2.patch Attaching a new version of the patch that includes the ASF license header in the .reviewboardrc file. Add .reviewboardrc file --- Key: HIVE-6481 URL: https://issues.apache.org/jira/browse/HIVE-6481 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-6481.1.patch, HIVE-6481.2.patch We should add a .reviewboardrc file to trunk in order to streamline the review process. Used in conjunction with RBTools this file makes posting a review request as simple as executing the following command: % rbt post -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6481) Add .reviewboardrc file
Carl Steinbach created HIVE-6481: Summary: Add .reviewboardrc file Key: HIVE-6481 URL: https://issues.apache.org/jira/browse/HIVE-6481 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach We should add a .reviewboardrc file to trunk in order to streamline the review process. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5783) Native Parquet Support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5783: - Labels: Parquet (was: ) Native Parquet Support in Hive -- Key: HIVE-5783 URL: https://issues.apache.org/jira/browse/HIVE-5783 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Justin Coffey Assignee: Justin Coffey Priority: Minor Labels: Parquet Fix For: 0.13.0 Attachments: HIVE-5783.noprefix.patch, HIVE-5783.noprefix.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch Problem Statement: Hive would be easier to use if it had native Parquet support. Our organization, Criteo, uses Hive extensively. Therefore we built the Parquet Hive integration and would like to now contribute that integration to Hive. About Parquet: Parquet is a columnar storage format for Hadoop and integrates with many Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration. Changes Details: Parquet was built with dependency management in mind and therefore only a single Parquet jar will be added as a dependency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5998) Add vectorized reader for Parquet files
[ https://issues.apache.org/jira/browse/HIVE-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5998: - Labels: Parquet (was: ) Add vectorized reader for Parquet files --- Key: HIVE-5998 URL: https://issues.apache.org/jira/browse/HIVE-5998 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Labels: Parquet Attachments: HIVE-5998.1.patch HIVE-5783 is adding native Parquet support in Hive. As Parquet is a columnar format, it makes sense to provide a vectorized reader, similar to how RC and ORC formats have, to benefit from vectorized execution engine. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6394) Implement Timestmap in ParquetSerde
[ https://issues.apache.org/jira/browse/HIVE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6394: - Component/s: Serializers/Deserializers Implement Timestmap in ParquetSerde --- Key: HIVE-6394 URL: https://issues.apache.org/jira/browse/HIVE-6394 Project: Hive Issue Type: Sub-task Components: Serializers/Deserializers Reporter: Jarek Jarcec Cecho Labels: Parquet This JIRA is to implement timestamp support in Parquet SerDe. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6375) Implement CTAS and column rename for parquet
[ https://issues.apache.org/jira/browse/HIVE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6375: - Labels: Parquet (was: ) Implement CTAS and column rename for parquet Key: HIVE-6375 URL: https://issues.apache.org/jira/browse/HIVE-6375 Project: Hive Issue Type: Bug Reporter: Brock Noland Priority: Critical Labels: Parquet More details here: https://github.com/Parquet/parquet-mr/issues/272 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6394) Implement Timestmap in ParquetSerde
[ https://issues.apache.org/jira/browse/HIVE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6394: - Labels: Parquet (was: ) Implement Timestmap in ParquetSerde --- Key: HIVE-6394 URL: https://issues.apache.org/jira/browse/HIVE-6394 Project: Hive Issue Type: Sub-task Components: Serializers/Deserializers Reporter: Jarek Jarcec Cecho Labels: Parquet This JIRA is to implement timestamp support in Parquet SerDe. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6367) Implement Decimal in ParquetSerde
[ https://issues.apache.org/jira/browse/HIVE-6367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6367: - Labels: Parquet (was: ) Implement Decimal in ParquetSerde - Key: HIVE-6367 URL: https://issues.apache.org/jira/browse/HIVE-6367 Project: Hive Issue Type: Sub-task Components: Serializers/Deserializers Reporter: Brock Noland Assignee: Xuefu Zhang Labels: Parquet Some code in the Parquet SerDe handles decimal and other code does not. For example, in ETypeConverter we convert Decimal to double (which is invalid), whereas in DataWritableWriter and other locations we throw an exception if decimal is used. This JIRA is to implement decimal support. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
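As an illustration of why converting a decimal to a double can be lossy, here is a minimal sketch that uses java.math.BigDecimal rather than Hive's own decimal type. The class and method names are hypothetical and are not part of the Parquet SerDe. The ticket does not state why the conversion is invalid, so reading it as a precision problem is an assumption.

```java
import java.math.BigDecimal;

// Hypothetical helper, not Hive code: checks whether a decimal value keeps
// its exact value after being forced through a double.
final class DecimalToDoubleSketch {
    static boolean survivesDoubleRoundTrip(BigDecimal d) {
        // new BigDecimal(double) exposes the exact binary value of the double,
        // so any rounding introduced by doubleValue() becomes visible here.
        BigDecimal roundTripped = new BigDecimal(d.doubleValue());
        return roundTripped.compareTo(d) == 0;
    }

    public static void main(String[] args) {
        // 0.5 is exactly representable in binary floating point; 0.1 is not.
        System.out.println(survivesDoubleRoundTrip(new BigDecimal("0.5")));
        System.out.println(survivesDoubleRoundTrip(new BigDecimal("0.1")));
    }
}
```

Under that reading, a value such as 0.1 would come back from a double-based converter as a slightly different number, which is why a dedicated decimal code path seems preferable to a conversion.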
[jira] [Updated] (HIVE-6384) Implement all Hive data types in Parquet
[ https://issues.apache.org/jira/browse/HIVE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6384: - Labels: Parquet (was: ) Implement all Hive data types in Parquet Key: HIVE-6384 URL: https://issues.apache.org/jira/browse/HIVE-6384 Project: Hive Issue Type: Task Reporter: Brock Noland Labels: Parquet Uber JIRA to track implementation of binary, timestamp, date, char, varchar or decimal. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6367) Implement Decimal in ParquetSerde
[ https://issues.apache.org/jira/browse/HIVE-6367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6367: - Component/s: Serializers/Deserializers Implement Decimal in ParquetSerde - Key: HIVE-6367 URL: https://issues.apache.org/jira/browse/HIVE-6367 Project: Hive Issue Type: Sub-task Components: Serializers/Deserializers Reporter: Brock Noland Assignee: Xuefu Zhang Some code in the Parquet SerDe handles decimal and other code does not. For example, in ETypeConverter we convert Decimal to double (which is invalid), whereas in DataWritableWriter and other locations we throw an exception if decimal is used. This JIRA is to implement decimal support. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6368) Document parquet on hive wiki
[ https://issues.apache.org/jira/browse/HIVE-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6368: - Component/s: Serializers/Deserializers Document parquet on hive wiki - Key: HIVE-6368 URL: https://issues.apache.org/jira/browse/HIVE-6368 Project: Hive Issue Type: Task Components: Serializers/Deserializers Reporter: Brock Noland Assignee: Brock Noland Priority: Critical Labels: Parquet Fix For: 0.13.0 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6414) ParquetInputFormat provides data values that do not match the object inspectors
[ https://issues.apache.org/jira/browse/HIVE-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6414: - Component/s: Serializers/Deserializers ParquetInputFormat provides data values that do not match the object inspectors --- Key: HIVE-6414 URL: https://issues.apache.org/jira/browse/HIVE-6414 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Remus Rusanu Assignee: Justin Coffey While working on HIVE-5998 I noticed that the ParquetRecordReader returns IntWritable for all 'int like' types, which disagrees with the row object inspectors. I thought that was fine and worked my way around it, but I now see that the issue triggers failures in other places, e.g. in aggregates:
{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {cint:528534767,ctinyint:31,csmallint:4963,cfloat:31.0,cdouble:4963.0,cstring1:cvLH6Eat2yFsyy7p}
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
    ... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to java.lang.Short
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:808)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
    ... 9 more
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to java.lang.Short
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaShortObjectInspector.get(JavaShortObjectInspector.java:41)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:671)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:631)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.merge(GenericUDAFMin.java:109)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.iterate(GenericUDAFMin.java:96)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:183)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:641)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:838)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:735)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:803)
    ... 15 more
{noformat}
My test is (I'm writing a test .q from HIVE-5998, but the repro does not involve vectorization):
{noformat}
create table if not exists alltypes_parquet ( cint int, ctinyint tinyint, csmallint smallint, cfloat float, cdouble double, cstring1 string) stored as parquet;
insert overwrite table alltypes_parquet select cint, ctinyint, csmallint, cfloat, cdouble, cstring1 from alltypesorc;
explain select * from alltypes_parquet limit 10;
select * from alltypes_parquet limit 10;
explain select ctinyint, max(cint), min(csmallint), count(cstring1), avg(cfloat), stddev_pop(cdouble) from alltypes_parquet group by ctinyint;
select ctinyint, max(cint), min(csmallint), count(cstring1), avg(cfloat), stddev_pop(cdouble) from alltypes_parquet group by ctinyint;
{noformat}
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
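The innermost ClassCastException in the trace for HIVE-6414 can be reproduced in isolation. The following is a minimal sketch with hypothetical stand-in classes: IntBox plays the role of an int-wrapping writable such as org.apache.hadoop.io.IntWritable, and ShortInspector plays the role of an inspector that expects java.lang.Short. Neither is a real Hive or Hadoop class, and the real inspectors are more involved than this.

```java
// Stand-in for a writable that wraps an int (hypothetical, not Hadoop code).
final class IntBox {
    final int value;
    IntBox(int value) { this.value = value; }
}

// Stand-in for an inspector that assumes its input is a java.lang.Short.
final class ShortInspector {
    static short get(Object o) {
        // Unchecked downcast: throws ClassCastException at runtime when the
        // reader handed over some other type.
        return (Short) o;
    }
}

final class InspectorMismatchDemo {
    // Returns true when reading the value through the short inspector fails.
    static boolean mismatchFails(Object fromReader) {
        try {
            ShortInspector.get(fromReader);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // The reader returned an int-like wrapper for a smallint column.
        System.out.println(mismatchFails(new IntBox(4963)));
        // The reader returned the type the inspector expects.
        System.out.println(mismatchFails(Short.valueOf((short) 4963)));
    }
}
```

The sketch only shows the failure mode: the cast is not checked until a consumer such as an aggregate actually reads the value, which would explain why a plain select can succeed while the group-by query fails.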
[jira] [Updated] (HIVE-6366) Refactor some items in Hive Parquet
[ https://issues.apache.org/jira/browse/HIVE-6366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6366: - Labels: Parquet (was: ) Refactor some items in Hive Parquet --- Key: HIVE-6366 URL: https://issues.apache.org/jira/browse/HIVE-6366 Project: Hive Issue Type: Task Reporter: Brock Noland Labels: Parquet [~jcoffey] and I have discussed some refactoring of the Parquet SerDe that we'd like to do post-commit. Specifically: * Clean up the labeled TODO items in the Parquet code * Understand whether the paths need to be converted to schemeless paths, and then clean up the handling of paths in ProjectPusher * Object inspectors are written such that they can inspect inspected results (e.g. if Hive decides to inspect the result of an inspection), which should not happen * BinaryWritable can be removed -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5998) Add vectorized reader for Parquet files
[ https://issues.apache.org/jira/browse/HIVE-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5998: - Component/s: Vectorization Serializers/Deserializers Add vectorized reader for Parquet files --- Key: HIVE-5998 URL: https://issues.apache.org/jira/browse/HIVE-5998 Project: Hive Issue Type: Sub-task Components: Serializers/Deserializers, Vectorization Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Labels: Parquet Attachments: HIVE-5998.1.patch HIVE-5783 is adding native Parquet support in Hive. As Parquet is a columnar format, it makes sense to provide a vectorized reader, similar to how RC and ORC formats have, to benefit from vectorized execution engine. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6414) ParquetInputFormat provides data values that do not match the object inspectors
[ https://issues.apache.org/jira/browse/HIVE-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6414: - Labels: Parquet (was: ) ParquetInputFormat provides data values that do not match the object inspectors --- Key: HIVE-6414 URL: https://issues.apache.org/jira/browse/HIVE-6414 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Remus Rusanu Assignee: Justin Coffey Labels: Parquet While working on HIVE-5998 I noticed that the ParquetRecordReader returns IntWritable for all 'int like' types, which disagrees with the row object inspectors. I thought that was fine and worked my way around it, but I now see that the issue triggers failures in other places, e.g. in aggregates:
{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {cint:528534767,ctinyint:31,csmallint:4963,cfloat:31.0,cdouble:4963.0,cstring1:cvLH6Eat2yFsyy7p}
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
    ... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to java.lang.Short
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:808)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
    ... 9 more
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to java.lang.Short
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaShortObjectInspector.get(JavaShortObjectInspector.java:41)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:671)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:631)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.merge(GenericUDAFMin.java:109)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.iterate(GenericUDAFMin.java:96)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:183)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:641)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:838)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:735)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:803)
    ... 15 more
{noformat}
My test is (I'm writing a test .q from HIVE-5998, but the repro does not involve vectorization):
{noformat}
create table if not exists alltypes_parquet ( cint int, ctinyint tinyint, csmallint smallint, cfloat float, cdouble double, cstring1 string) stored as parquet;
insert overwrite table alltypes_parquet select cint, ctinyint, csmallint, cfloat, cdouble, cstring1 from alltypesorc;
explain select * from alltypes_parquet limit 10;
select * from alltypes_parquet limit 10;
explain select ctinyint, max(cint), min(csmallint), count(cstring1), avg(cfloat), stddev_pop(cdouble) from alltypes_parquet group by ctinyint;
select ctinyint, max(cint), min(csmallint), count(cstring1), avg(cfloat), stddev_pop(cdouble) from alltypes_parquet group by ctinyint;
{noformat}
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-4558) mapreduce_stack_trace_hadoop20 in TestNegativeMinimrCliDriver fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-4558: - Assignee: (was: Carl Steinbach) mapreduce_stack_trace_hadoop20 in TestNegativeMinimrCliDriver fails on Windows -- Key: HIVE-4558 URL: https://issues.apache.org/jira/browse/HIVE-4558 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.11.0 Environment: Windows Reporter: Xi Fang Attachments: HIVE-4558.1.patch testNegativeCliDriver_mapreduce_stack_trace_hadoop20 fails because group information is printed out on Windows. Here is the example of mapreduce_stack_trace_hadoop20.q.out.orig: -- PREHOOK: query: FROM src SELECT TRANSFORM(key, value) USING 'script_does_not_exist' AS (key, value) PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Output: hdfs://127.0.0.1:25477/code/HWX/hive-monarch/build/ql/scratchdir/hive_2013-05-14_15-21-00_075_593034964465269090/-mr-1 Ended Job = job_20130514152027587_0001 with errors FATAL ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {key:238,value:val_238} groups found for user Administrators Hive Runtime Error while processing row {key:238,value:val_238} -- However, it is supposed to look like: -- PREHOOK: query: FROM src SELECT TRANSFORM(key, value) USING 'script_does_not_exist' AS (key, value) PREHOOK: type: QUERY PREHOOK: Input: default@src \ A masked pattern was here FATAL ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {key:238,value:val_238} Hive Runtime Error while processing row {key:238,value:val_238} -- -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-4558) mapreduce_stack_trace_hadoop20 in TestNegativeMinimrCliDriver fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-4558: - Assignee: Carl Steinbach mapreduce_stack_trace_hadoop20 in TestNegativeMinimrCliDriver fails on Windows -- Key: HIVE-4558 URL: https://issues.apache.org/jira/browse/HIVE-4558 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.11.0 Environment: Windows Reporter: Xi Fang Assignee: Carl Steinbach Attachments: HIVE-4558.1.patch testNegativeCliDriver_mapreduce_stack_trace_hadoop20 fails because group information is printed out on Windows. Here is the example of mapreduce_stack_trace_hadoop20.q.out.orig: -- PREHOOK: query: FROM src SELECT TRANSFORM(key, value) USING 'script_does_not_exist' AS (key, value) PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Output: hdfs://127.0.0.1:25477/code/HWX/hive-monarch/build/ql/scratchdir/hive_2013-05-14_15-21-00_075_593034964465269090/-mr-1 Ended Job = job_20130514152027587_0001 with errors FATAL ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {key:238,value:val_238} groups found for user Administrators Hive Runtime Error while processing row {key:238,value:val_238} -- However, it is supposed to look like: -- PREHOOK: query: FROM src SELECT TRANSFORM(key, value) USING 'script_does_not_exist' AS (key, value) PREHOOK: type: QUERY PREHOOK: Input: default@src \ A masked pattern was here FATAL ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {key:238,value:val_238} Hive Runtime Error while processing row {key:238,value:val_238} -- -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-4558) mapreduce_stack_trace_hadoop20 in TestNegativeMinimrCliDriver fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-4558: - Assignee: Xi Fang mapreduce_stack_trace_hadoop20 in TestNegativeMinimrCliDriver fails on Windows -- Key: HIVE-4558 URL: https://issues.apache.org/jira/browse/HIVE-4558 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.11.0 Environment: Windows Reporter: Xi Fang Assignee: Xi Fang Attachments: HIVE-4558.1.patch testNegativeCliDriver_mapreduce_stack_trace_hadoop20 fails because group information is printed out on Windows. Here is the example of mapreduce_stack_trace_hadoop20.q.out.orig: -- PREHOOK: query: FROM src SELECT TRANSFORM(key, value) USING 'script_does_not_exist' AS (key, value) PREHOOK: type: QUERY PREHOOK: Input: default@src PREHOOK: Output: hdfs://127.0.0.1:25477/code/HWX/hive-monarch/build/ql/scratchdir/hive_2013-05-14_15-21-00_075_593034964465269090/-mr-1 Ended Job = job_20130514152027587_0001 with errors FATAL ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {key:238,value:val_238} groups found for user Administrators Hive Runtime Error while processing row {key:238,value:val_238} -- However, it is supposed to look like: -- PREHOOK: query: FROM src SELECT TRANSFORM(key, value) USING 'script_does_not_exist' AS (key, value) PREHOOK: type: QUERY PREHOOK: Input: default@src \ A masked pattern was here FATAL ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {key:238,value:val_238} Hive Runtime Error while processing row {key:238,value:val_238} -- -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-3129) Create windows native scripts (CMD files) to run hive on windows without Cygwin
[ https://issues.apache.org/jira/browse/HIVE-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-3129: - Assignee: Xi Fang Create windows native scripts (CMD files) to run hive on windows without Cygwin Key: HIVE-3129 URL: https://issues.apache.org/jira/browse/HIVE-3129 Project: Hive Issue Type: Bug Components: CLI, Windows Affects Versions: 0.11.0 Reporter: Kanna Karanam Assignee: Xi Fang Labels: Windows Attachments: HIVE-3129.1.patch, HIVE-3129.2.patch, HIVE-3129.unittest.2.patch, HIVE-3129.unittest.patch Create the cmd files equivalent to a)Bin\hive b)Bin\hive-config.sh c)Bin\Init-hive-dfs.sh d)Bin\ext\cli.sh e)Bin\ext\debug.sh f)Bin\ext\help.sh g)Bin\ext\hiveserver.sh h)Bin\ext\jar.sh i)Bin\ext\hwi.sh j)Bin\ext\lineage.sh k)Bin\ext\metastore.sh l)Bin\ext\rcfilecat.sh -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-4349) Fix the Hive unit test failures when the Hive enlistment root path is longer than ~12 characters
[ https://issues.apache.org/jira/browse/HIVE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-4349: - Assignee: Xi Fang Fix the Hive unit test failures when the Hive enlistment root path is longer than ~12 characters Key: HIVE-4349 URL: https://issues.apache.org/jira/browse/HIVE-4349 Project: Hive Issue Type: Bug Affects Versions: 0.11.0 Reporter: Xi Fang Assignee: Xi Fang Attachments: HIVE-4349.1.patch If the Hive enlistment root path is longer than about 12 characters, the test classpath “hadoop.testcp” exceeds 8K characters, so most of the Hive unit tests cannot be run on Windows. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
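To see why a few extra characters in the root path matter for HIVE-4349, note that the root is repeated once per classpath entry. The sketch below is an illustration only: the class name, the entry names, the ';' separator and the 8191-character limit (the cmd.exe command-line limit) are assumptions, since the ticket only says "8K chars" and does not describe how “hadoop.testcp” is assembled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Hive build.
final class ClasspathLengthSketch {
    // Assumed limit; the ticket only mentions "8K chars".
    static final int ASSUMED_LIMIT = 8191;

    // Length of a classpath in which every entry is prefixed with the root
    // and entries are joined with ';'.
    static int classpathLength(String root, List<String> relativeEntries) {
        List<String> absolute = new ArrayList<>();
        for (String entry : relativeEntries) {
            absolute.add(root + "\\" + entry);
        }
        return String.join(";", absolute).length();
    }

    static boolean exceedsLimit(String root, List<String> relativeEntries) {
        return classpathLength(root, relativeEntries) > ASSUMED_LIMIT;
    }

    public static void main(String[] args) {
        // Made-up entries: 200 jars under a made-up relative directory.
        List<String> entries = new ArrayList<>();
        for (int i = 0; i < 200; i++) {
            entries.add("build\\lib\\dependency-" + i + ".jar");
        }
        // Each extra root character costs one character per entry.
        System.out.println(classpathLength("C:\\hive", entries));
        System.out.println(classpathLength("C:\\src\\hive-trunk", entries));
    }
}
```

With a few hundred entries, every additional character in the root adds a few hundred characters to the total, which is consistent with the ticket's observation that a small change in root length decides whether the tests can run.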
[jira] [Updated] (HIVE-4445) Fix the Hive unit test failures on Windows when Linux scripts or commands are used in test cases
[ https://issues.apache.org/jira/browse/HIVE-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-4445: - Assignee: Xi Fang Fix the Hive unit test failures on Windows when Linux scripts or commands are used in test cases Key: HIVE-4445 URL: https://issues.apache.org/jira/browse/HIVE-4445 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.11.0 Environment: Windows Reporter: Xi Fang Assignee: Xi Fang Attachments: HIVE-4445.1.patch The following unit tests fail on Windows because Linux scripts or commands are used in the test cases or .q files: 1. TestMinimrCliDriver: scriptfile1.q 2. TestNegativeMinimrCliDriver: mapreduce_stack_trace_hadoop20.q, minimr_broken_pipe.q 3. TestCliDriver: hiveprofiler_script0.q -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881624#comment-13881624 ] Carl Steinbach commented on HIVE-5783: -- I noticed that this SerDe doesn't support several of Hive's types: binary, timestamp, date, and probably a couple of others as well. If there are other known limitations, it would be helpful to list them. Native Parquet Support in Hive -- Key: HIVE-5783 URL: https://issues.apache.org/jira/browse/HIVE-5783 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Justin Coffey Assignee: Justin Coffey Priority: Minor Fix For: 0.13.0 Attachments: HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch Problem Statement: Hive would be easier to use if it had native Parquet support. Our organization, Criteo, uses Hive extensively. Therefore we built the Parquet Hive integration and would like to now contribute that integration to Hive. About Parquet: Parquet is a columnar storage format for Hadoop and integrates with many Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration. Changes Details: Parquet was built with dependency management in mind and therefore only a single Parquet jar will be added as a dependency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6043) Document incompatible changes in Hive 0.12 and trunk
[ https://issues.apache.org/jira/browse/HIVE-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6043: - Description: We need to document incompatible changes. For example * HIVE-5372 changed object inspector hierarchy breaking most if not all custom serdes * HIVE-1511/HIVE-5263 serializes ObjectInspectors with Kryo so all custom serdes (fixed by HIVE-5380) * Hive 0.12 (HIVE-4825) separates MapredWork into MapWork and ReduceWork which is used by Serdes * HIVE-5411 serializes expressions with Kryo which are used by custom serdes * HIVE-4827 removed the flag of hive.optimize.mapjoin.mapreduce (This flag was introduced in Hive 0.11 by HIVE-3952). was: We need to document incompatible changes. For example * HIVE-5372 changed object inspector hierarchy breaking most if not all custom serdes * HIVE-1511/HIVE-5263 serializes ObjectInspectors with Kryo so all custom serdes (fixed by HIVE-5380) * Hive 0.12 separates MapredWork into MapWork and ReduceWork which is used by Serdes * HIVE-5411 serializes expressions with Kryo which are used by custom serdes * HIVE-4827 removed the flag of hive.optimize.mapjoin.mapreduce (This flag was introduced in Hive 0.11 by HIVE-3952). Document incompatible changes in Hive 0.12 and trunk Key: HIVE-6043 URL: https://issues.apache.org/jira/browse/HIVE-6043 Project: Hive Issue Type: Task Reporter: Brock Noland Priority: Blocker We need to document incompatible changes. For example * HIVE-5372 changed object inspector hierarchy breaking most if not all custom serdes * HIVE-1511/HIVE-5263 serializes ObjectInspectors with Kryo so all custom serdes (fixed by HIVE-5380) * Hive 0.12 (HIVE-4825) separates MapredWork into MapWork and ReduceWork which is used by Serdes * HIVE-5411 serializes expressions with Kryo which are used by custom serdes * HIVE-4827 removed the flag of hive.optimize.mapjoin.mapreduce (This flag was introduced in Hive 0.11 by HIVE-3952). 
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13875419#comment-13875419 ] Carl Steinbach commented on HIVE-5783: -- I noticed that many of the source files contain Criteo copyright notices. The ASF has a policy on this, which is documented here: https://www.apache.org/legal/src-headers.html Since this patch was submitted directly to the ASF by the copyright owner or the owner's agent, it sounds like we have three options for handling this: # Remove the notices, # Move them to the NOTICE file associated with each applicable project release, or # Provide written permission for the ASF to make such removal or relocation of the notices. [~jcoffey] [~rusanu] Do you have a preference? Native Parquet Support in Hive -- Key: HIVE-5783 URL: https://issues.apache.org/jira/browse/HIVE-5783 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Justin Coffey Assignee: Justin Coffey Priority: Minor Attachments: HIVE-5783.patch, HIVE-5783.patch, hive-0.11-parquet.patch, parquet-hive.patch Problem Statement: Hive would be easier to use if it had native Parquet support. Our organization, Criteo, uses Hive extensively. Therefore we built the Parquet Hive integration and would like to now contribute that integration to Hive. About Parquet: Parquet is a columnar storage format for Hadoop and integrates with many Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration. Changes Details: Parquet was built with dependency management in mind and therefore only a single Parquet jar will be added as a dependency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-3159) Update AvroSerde to determine schema of new tables
[ https://issues.apache.org/jira/browse/HIVE-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-3159: - Status: Open (was: Patch Available) I left some comments on reviewboard. Thanks. Update AvroSerde to determine schema of new tables -- Key: HIVE-3159 URL: https://issues.apache.org/jira/browse/HIVE-3159 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Jakob Homan Assignee: Mohammad Kamrul Islam Attachments: HIVE-3159.4.patch, HIVE-3159.5.patch, HIVE-3159.6.patch, HIVE-3159.7.patch, HIVE-3159v1.patch Currently when writing tables to Avro one must manually provide an Avro schema that matches what is being delivered by Hive. It'd be better to have the serde infer this schema by converting the table's TypeInfo into an appropriate AvroSchema. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5829) Rewrite Trim and Pad UDFs based on GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5829: - Fix Version/s: 0.13.0 Committed to trunk. Thanks Mohammad! Rewrite Trim and Pad UDFs based on GenericUDF - Key: HIVE-5829 URL: https://issues.apache.org/jira/browse/HIVE-5829 Project: Hive Issue Type: Bug Components: UDF Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Fix For: 0.13.0 Attachments: HIVE-5829.1.patch, HIVE-5829.2.patch, HIVE-5829.3.patch, HIVE-5829.4.patch, tmp.HIVE-5829.patch This JIRA includes following UDFs: 1. trim() 2. ltrim() 3. rtrim() 4. lpad() 5. rpad() -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6050) JDBC backward compatibility is broken
[ https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863408#comment-13863408 ]
Carl Steinbach commented on HIVE-6050:
--
I think running an older JDBC driver against a newer server version is going to be the more common scenario, since there will always be clients that are slow to upgrade.
JDBC backward compatibility is broken
-
Key: HIVE-6050
URL: https://issues.apache.org/jira/browse/HIVE-6050
Project: Hive
Issue Type: Bug
Components: HiveServer2, JDBC
Reporter: Szehon Ho
Priority: Blocker
Connecting from the JDBC driver of Hive 0.13 (TProtocolVersion=v4) to HiveServer2 of Hive 0.10 (TProtocolVersion=v1) returns the following exception:
{noformat}
java.sql.SQLException: Could not establish connection to jdbc:hive2://localhost:1/default: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
	at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:336)
	at org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:158)
	at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
	at java.sql.DriverManager.getConnection(DriverManager.java:571)
	at java.sql.DriverManager.getConnection(DriverManager.java:187)
	at org.apache.hive.jdbc.MyTestJdbcDriver2.getConnection(MyTestJdbcDriver2.java:73)
	at org.apache.hive.jdbc.MyTestJdbcDriver2.<init>(MyTestJdbcDriver2.java:49)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
	at org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:187)
	at org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:236)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
	at org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:233)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
	at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:523)
	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1063)
	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:914)
Caused by: org.apache.thrift.TApplicationException: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
	at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
	at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:160)
	at org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:147)
	at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:327)
	... 37 more
{noformat}
On code analysis, it looks like the 'client_protocol' field is a Thrift enum, which doesn't seem to be backward-compatible. Look at the code path in the generated file 'TOpenSessionReq.java', method TOpenSessionReqStandardScheme.read():
1. The method will call 'TProtocolVersion.findValue()' on the value read from the Thrift protocol's byte stream, which returns null if the client is sending an enum value unknown to the server (v4 is unknown to the server).
2. The method will then call struct.validate(), which will throw the above exception because of the null version.
So it doesn't look like the current backward-compatibility scheme will work.
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
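The two steps in the analysis above can be reproduced in a small stand-alone sketch. Everything here is a hypothetical stand-in rather than the Thrift-generated code: the names EnumCompatDemo, ProtocolVersionSketch and OpenSessionReqSketch, the integer wire values 0 and 3, and the use of IllegalStateException are assumptions chosen only to show the failure mode (an unknown enum value decodes to null, and validation of the required field then fails).

```java
// Hypothetical demo of the failure mode described in the report; not Hive or Thrift code.
final class EnumCompatDemo {
    public static void main(String[] args) {
        OpenSessionReqSketch known = new OpenSessionReqSketch();
        known.read(0); // a wire value the "old server" knows
        System.out.println("decoded: " + known.getClientProtocol());

        try {
            new OpenSessionReqSketch().read(3); // assumed wire value sent by a newer client
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}

/** Stand-in for a generated enum that contains only the value an old server knows. */
enum ProtocolVersionSketch {
    V1(0); // assumed wire value, for illustration only

    private final int value;

    ProtocolVersionSketch(int value) {
        this.value = value;
    }

    /** Mirrors the behaviour described in the report: unknown wire values yield null. */
    static ProtocolVersionSketch findValue(int wireValue) {
        for (ProtocolVersionSketch v : values()) {
            if (v.value == wireValue) {
                return v;
            }
        }
        return null;
    }
}

/** Stand-in for the request struct with one required enum field. */
final class OpenSessionReqSketch {
    private ProtocolVersionSketch clientProtocol;

    /** Step 1: decode the enum from its integer wire value, then validate. */
    void read(int wireValue) {
        clientProtocol = ProtocolVersionSketch.findValue(wireValue);
        validate();
    }

    /** Step 2: a required field that decoded to null is reported as unset. */
    void validate() {
        if (clientProtocol == null) {
            throw new IllegalStateException("Required field 'client_protocol' is unset!");
        }
    }

    ProtocolVersionSketch getClientProtocol() {
        return clientProtocol;
    }
}
```

The point of the sketch is that the server cannot distinguish "field missing" from "field present but carrying a value it has never heard of", which is why adding a new enum constant on the client side is enough to break an older server.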
[jira] [Updated] (HIVE-6050) JDBC backward compatibility is broken
[ https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6050: - Component/s: HiveServer2 JDBC backward compatibility is broken - Key: HIVE-6050 URL: https://issues.apache.org/jira/browse/HIVE-6050 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Reporter: Szehon Ho Priority: Blocker -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6050) JDBC backward compatibility is broken
[ https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863426#comment-13863426 ] Carl Steinbach commented on HIVE-6050: -- It looks like Thrift IDL is not backward compatible with respect to enums. We use enums in other places in the IDL (e.g. TTypeId, TStatusCode, TOperationState, TOperationType, TGetTypeInfo, TFetchOrientation), and should probably investigate whether these references need to be updated as well. I'm convinced that using an enum for TGetTypeInfo was a bad idea, and suspect that the same may also be true for TTypeId. JDBC backward compatibility is broken - Key: HIVE-6050 URL: https://issues.apache.org/jira/browse/HIVE-6050 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Reporter: Szehon Ho Priority: Blocker
[jira] [Assigned] (HIVE-6050) JDBC backward compatibility is broken
[ https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach reassigned HIVE-6050: Assignee: Carl Steinbach JDBC backward compatibility is broken - Key: HIVE-6050 URL: https://issues.apache.org/jira/browse/HIVE-6050 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Reporter: Szehon Ho Assignee: Carl Steinbach Priority: Blocker -- This message was sent by Atlassian JIRA (v6.1.5#6160)