[jira] [Updated] (HIVE-3159) Update AvroSerde to determine schema of new tables
[ https://issues.apache.org/jira/browse/HIVE-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-3159: - Status: Open (was: Patch Available) [~kamrul] Is the failure in TestMinimrCliDriver.testCliDriver_bucket_num_reducers reproducible? Update AvroSerde to determine schema of new tables -- Key: HIVE-3159 URL: https://issues.apache.org/jira/browse/HIVE-3159 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Jakob Homan Assignee: Mohammad Kamrul Islam Attachments: HIVE-3159.4.patch, HIVE-3159.5.patch, HIVE-3159.6.patch, HIVE-3159v1.patch Currently when writing tables to Avro one must manually provide an Avro schema that matches what is being delivered by Hive. It'd be better to have the serde infer this schema by converting the table's TypeInfo into an appropriate AvroSchema. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
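The schema inference proposed in HIVE-3159 can be illustrated with a rough sketch. This is not the AvroSerde implementation: the type mapping table, the function name avro_schema_for, and the example table are assumptions made for illustration, and only a handful of primitive types are covered. Each column becomes a nullable Avro field, since Hive columns may hold NULL.

```python
import json

# Hypothetical mapping from Hive primitive type names to Avro primitive types.
# The real serde would work from Hive TypeInfo objects and cover complex types too.
HIVE_TO_AVRO = {
    "boolean": "boolean",
    "int": "int",
    "bigint": "long",
    "float": "float",
    "double": "double",
    "string": "string",
    "binary": "bytes",
}


def avro_schema_for(table_name, columns):
    """Build an Avro record schema (as a dict) from (name, hive_type) pairs.

    Every field is a union with "null" so that Hive NULL values can be written.
    """
    fields = []
    for name, hive_type in columns:
        avro_type = HIVE_TO_AVRO.get(hive_type.lower())
        if avro_type is None:
            raise ValueError(f"unsupported Hive type: {hive_type}")
        fields.append({"name": name, "type": ["null", avro_type], "default": None})
    return {"type": "record", "name": table_name, "fields": fields}


# Example table definition (made up for this sketch).
schema = avro_schema_for("episodes", [("title", "string"), ("air_date", "bigint")])
print(json.dumps(schema, indent=2))
```

With something along these lines, a user would declare only the Hive columns and the serde would derive the Avro schema instead of requiring it to be supplied by hand.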
[jira] [Commented] (HIVE-5829) Rewrite Trim and Pad UDFs based on GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863887#comment-13863887 ] Carl Steinbach commented on HIVE-5829: -- +1 Rewrite Trim and Pad UDFs based on GenericUDF - Key: HIVE-5829 URL: https://issues.apache.org/jira/browse/HIVE-5829 Project: Hive Issue Type: Bug Components: UDF Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: HIVE-5829.1.patch, HIVE-5829.2.patch, HIVE-5829.3.patch, HIVE-5829.4.patch, tmp.HIVE-5829.patch This JIRA includes following UDFs: 1. trim() 2. ltrim() 3. rtrim() 4. lpad() 5. rpad() -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-3746) Fix HS2 ResultSet Serialization Performance Regression
[ https://issues.apache.org/jira/browse/HIVE-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861034#comment-13861034 ] Carl Steinbach commented on HIVE-3746: -- [~vaibhavgumashta] Cool! [~navis] Thanks again for putting this patch together! I'm in the process of committing it now. Fix HS2 ResultSet Serialization Performance Regression -- Key: HIVE-3746 URL: https://issues.apache.org/jira/browse/HIVE-3746 Project: Hive Issue Type: Sub-task Components: HiveServer2, Server Infrastructure Reporter: Carl Steinbach Assignee: Navis Labels: HiveServer2, jdbc, thrift Attachments: HIVE-3746.1.patch.txt, HIVE-3746.2.patch.txt, HIVE-3746.3.patch.txt, HIVE-3746.4.patch.txt, HIVE-3746.5.patch.txt, HIVE-3746.6.patch.txt, HIVE-3746.7.patch.txt -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-3746) Fix HS2 ResultSet Serialization Performance Regression
[ https://issues.apache.org/jira/browse/HIVE-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-3746: - Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks Navis! Fix HS2 ResultSet Serialization Performance Regression -- Key: HIVE-3746 URL: https://issues.apache.org/jira/browse/HIVE-3746 Project: Hive Issue Type: Sub-task Components: HiveServer2, Server Infrastructure Reporter: Carl Steinbach Assignee: Navis Labels: HiveServer2, jdbc, thrift Fix For: 0.13.0 Attachments: HIVE-3746.1.patch.txt, HIVE-3746.2.patch.txt, HIVE-3746.3.patch.txt, HIVE-3746.4.patch.txt, HIVE-3746.5.patch.txt, HIVE-3746.6.patch.txt, HIVE-3746.7.patch.txt -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5911) Recent change to schema upgrade scripts breaks file naming conventions
[ https://issues.apache.org/jira/browse/HIVE-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5911: - Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks Sergey! Recent change to schema upgrade scripts breaks file naming conventions -- Key: HIVE-5911 URL: https://issues.apache.org/jira/browse/HIVE-5911 Project: Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Sergey Shelukhin Fix For: 0.13.0 Attachments: HIVE-5911.01.patch, HIVE-5911.patch The changes made in HIVE-5700 break the convention for naming schema upgrade scripts. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-3746) Fix HS2 ResultSet Serialization Performance Regression
[ https://issues.apache.org/jira/browse/HIVE-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859755#comment-13859755 ] Carl Steinbach commented on HIVE-3746: -- +1 I'm fine with committing the patch in its current state, but there's one thing I think we definitely need to fix ASAP in a followup patch. Up to this point we have managed to avoid polluting the client and service class interfaces ( i.e. CLIService and CLIServiceClient) with direct references to the Thrift serialization layer. This patch breaks that rule by exposing TProtocolVersion in the public methods of CliService. Only ThriftCLIService should need to know that the client is using a specific version of the Thrift serialization layer. Fix HS2 ResultSet Serialization Performance Regression -- Key: HIVE-3746 URL: https://issues.apache.org/jira/browse/HIVE-3746 Project: Hive Issue Type: Sub-task Components: HiveServer2, Server Infrastructure Reporter: Carl Steinbach Assignee: Navis Labels: HiveServer2, jdbc, thrift Attachments: HIVE-3746.1.patch.txt, HIVE-3746.2.patch.txt, HIVE-3746.3.patch.txt, HIVE-3746.4.patch.txt, HIVE-3746.5.patch.txt, HIVE-3746.6.patch.txt, HIVE-3746.7.patch.txt -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-3159) Update AvroSerde to determine schema of new tables
[ https://issues.apache.org/jira/browse/HIVE-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-3159: - Status: Open (was: Patch Available) Update AvroSerde to determine schema of new tables -- Key: HIVE-3159 URL: https://issues.apache.org/jira/browse/HIVE-3159 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Jakob Homan Assignee: Mohammad Kamrul Islam Attachments: HIVE-3159.4.patch, HIVE-3159.5.patch, HIVE-3159v1.patch Currently when writing tables to Avro one must manually provide an Avro schema that matches what is being delivered by Hive. It'd be better to have the serde infer this schema by converting the table's TypeInfo into an appropriate AvroSchema. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HIVE-5829) Rewrite Trim and Pad UDFs based on GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5829: - Status: Open (was: Patch Available) [~kamrul] I noted one small issue on RB related to the package names of the new tests. Other than that I think the patch is ready to commit. Rewrite Trim and Pad UDFs based on GenericUDF - Key: HIVE-5829 URL: https://issues.apache.org/jira/browse/HIVE-5829 Project: Hive Issue Type: Bug Components: UDF Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: HIVE-5829.1.patch, HIVE-5829.2.patch, tmp.HIVE-5829.patch This JIRA includes following UDFs: 1. trim() 2. ltrim() 3. rtrim() 4. lpad() 5. rpad() -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HIVE-5829) Rewrite Trim and Pad UDFs based on GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5829: - Component/s: UDF Rewrite Trim and Pad UDFs based on GenericUDF - Key: HIVE-5829 URL: https://issues.apache.org/jira/browse/HIVE-5829 Project: Hive Issue Type: Bug Components: UDF Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: HIVE-5829.1.patch, HIVE-5829.2.patch, tmp.HIVE-5829.patch This JIRA includes following UDFs: 1. trim() 2. ltrim() 3. rtrim() 4. lpad() 5. rpad() -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HIVE-5879) Fix spelling errors in hive-default.xml
[ https://issues.apache.org/jira/browse/HIVE-5879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848936#comment-13848936 ] Carl Steinbach commented on HIVE-5879: -- bq. I think generating hive-default.xml.template from HiveConf.ConfVars might be better option (making large texts included in HiveConf). Any opinions? +1 Fix spelling errors in hive-default.xml --- Key: HIVE-5879 URL: https://issues.apache.org/jira/browse/HIVE-5879 Project: Hive Issue Type: Improvement Affects Versions: 0.12.0 Reporter: Brock Noland Assignee: Lefty Leverenz Priority: Trivial Labels: documentation Fix For: 0.13.0 Attachments: HIVE-5879.2.patch.txt, HIVE-5879.patch See https://issues.apache.org/jira/browse/HIVE-5400?focusedCommentId=13830626page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13830626 -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Resolved] (HIVE-4977) HS2: support an alternate resultset serialization format between client and server
[ https://issues.apache.org/jira/browse/HIVE-4977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-4977. -- Resolution: Duplicate Resolving this as a duplicate of HIVE-3746. HS2: support an alternate resultset serialization format between client and server -- Key: HIVE-4977 URL: https://issues.apache.org/jira/browse/HIVE-4977 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.10.0, 0.11.0, 0.12.0 Reporter: Chris Drome Assignee: Chris Drome Current serialization protocol between client and server as defined in cli_service.thrift results in 2x (or more) throughput degradation compared to HS1. Initial proposal is to introduce HS1 serialization protocol as a negotiable alternative. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Resolved] (HIVE-5972) Hiveserver2 is much slower than hiveserver1
[ https://issues.apache.org/jira/browse/HIVE-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-5972. -- Resolution: Duplicate Resolving this as a duplicate of HIVE-3746. Hiveserver2 is much slower than hiveserver1 --- Key: HIVE-5972 URL: https://issues.apache.org/jira/browse/HIVE-5972 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.10.0 Reporter: a bc Priority: Critical We are building an MS SQL cube through a linked server connecting to hiveserver with Cloudera's ODBC driver. There are two test results: 1. hiveserver1 running on 2CPUs, 8G mem, took about 8 hours 2. hiveserver2 running on 4CPUs, 16 mem, took about 13 hours and 27min (never successful on machine with 2CPUs, 8G mem) In both cases, almost all CPUs are busy when building the cube. But I cannot understand why hiveserver2 is much slower than hiveserver1: according to the docs, hs2 supports concurrency, so shouldn't it be faster than hs1? Thanks. CDH4.3 on CentOS6. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HIVE-3746) Fix HS2 ResultSet Serialization Performance Regression
[ https://issues.apache.org/jira/browse/HIVE-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-3746: - Status: Open (was: Patch Available) [~navis] Thanks for working on this! I added some comments on RB. My main concern with the current patch is that it breaks backward compatibility with older clients. Fix HS2 ResultSet Serialization Performance Regression -- Key: HIVE-3746 URL: https://issues.apache.org/jira/browse/HIVE-3746 Project: Hive Issue Type: Sub-task Components: HiveServer2, Server Infrastructure Reporter: Carl Steinbach Assignee: Navis Labels: HiveServer2 Attachments: HIVE-3746.1.patch.txt, HIVE-3746.2.patch.txt, HIVE-3746.3.patch.txt -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Created] (HIVE-6020) Support UUIDs and versioning for DBs/Tables/Partitions/Columns
Carl Steinbach created HIVE-6020: Summary: Support UUIDs and versioning for DBs/Tables/Partitions/Columns Key: HIVE-6020 URL: https://issues.apache.org/jira/browse/HIVE-6020 Project: Hive Issue Type: Bug Components: Database/Schema, Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HIVE-5783) Native Parquet Support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5783: - Component/s: Serializers/Deserializers Native Parquet Support in Hive -- Key: HIVE-5783 URL: https://issues.apache.org/jira/browse/HIVE-5783 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Justin Coffey Assignee: Justin Coffey Priority: Minor Attachments: HIVE-5783.patch, hive-0.11-parquet.patch Problem Statement: Hive would be easier to use if it had native Parquet support. Our organization, Criteo, uses Hive extensively. Therefore we built the Parquet Hive integration and would like to now contribute that integration to Hive. About Parquet: Parquet is a columnar storage format for Hadoop and integrates with many Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration. Changes Details: Parquet was built with dependency management in mind and therefore only a single Parquet jar will be added as a dependency. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HIVE-5783) Native Parquet Support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5783: - Fix Version/s: (was: 0.11.0) Native Parquet Support in Hive -- Key: HIVE-5783 URL: https://issues.apache.org/jira/browse/HIVE-5783 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Justin Coffey Assignee: Justin Coffey Priority: Minor Attachments: HIVE-5783.patch, hive-0.11-parquet.patch Problem Statement: Hive would be easier to use if it had native Parquet support. Our organization, Criteo, uses Hive extensively. Therefore we built the Parquet Hive integration and would like to now contribute that integration to Hive. About Parquet: Parquet is a columnar storage format for Hadoop and integrates with many Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration. Changes Details: Parquet was built with dependency management in mind and therefore only a single Parquet jar will be added as a dependency. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HIVE-5230) Better error reporting by async threads in HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843728#comment-13843728 ] Carl Steinbach commented on HIVE-5230: -- I'm looking at it now. [~thejas] If you don't hear back from me before 11:56am tomorrow you should feel free to go ahead and commit the patch. Better error reporting by async threads in HiveServer2 -- Key: HIVE-5230 URL: https://issues.apache.org/jira/browse/HIVE-5230 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.12.0, 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-5230.1.patch, HIVE-5230.1.patch, HIVE-5230.2.patch, HIVE-5230.3.patch, HIVE-5230.4.patch, HIVE-5230.6.patch, HIVE-5230.7.patch, HIVE-5230.8.patch, HIVE-5230.9.patch [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. When a background thread hits an error, the client can currently only poll for the operation state, while the error and its stack trace are only written to the log. It would be useful to provide a richer error response, as the Thrift API does with TStatus (which is constructed while building a Thrift response object). -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843918#comment-13843918 ] Carl Steinbach commented on HIVE-5783: -- bq. on the parquet-hive side, we're good to submit a new patch with direct serde integration :) bq. I humbly submit that the two are not linked and one should not impede the other. I agree. It wasn't my intention to imply that these issues were linked. Sorry if that wasn't clear. In addition to the SerDe, can you please also include some test cases? I think it would be good to aim for coverage on par with what was provided with OrcFile. Also, the data/files directory contains two files (alltypes.txt and alltypesorc) which will make testing type support a lot easier. Native Parquet Support in Hive -- Key: HIVE-5783 URL: https://issues.apache.org/jira/browse/HIVE-5783 Project: Hive Issue Type: New Feature Reporter: Justin Coffey Assignee: Justin Coffey Priority: Minor Fix For: 0.11.0 Attachments: HIVE-5783.patch, hive-0.11-parquet.patch Problem Statement: Hive would be easier to use if it had native Parquet support. Our organization, Criteo, uses Hive extensively. Therefore we built the Parquet Hive integration and would like to now contribute that integration to Hive. About Parquet: Parquet is a columnar storage format for Hadoop and integrates with many Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration. Changes Details: Parquet was built with dependency management in mind and therefore only a single Parquet jar will be added as a dependency. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842804#comment-13842804 ] Carl Steinbach commented on HIVE-5783: -- [~brocknoland] Up to this point we have reserved first-class support for data formats in Hive (i.e. changing the grammar) to formats that are implemented natively in the Hive source repository. I think we should maintain this convention. There are a couple of options available if we feel that it's important for users to be able to create Parquet formatted tables using the abbreviated syntax: # Add a format registry feature to Hive that allows admins to register third-party SerDe implementations and associate them with a format keyword that users can reference in a DDL statement. # Maintain two copies of the Parquet SerDe implementation -- one in Hive and one in the parquet-mr repository -- and backport patches between these repositories as necessary. If users want to use the parquet-mr version of the SerDe with Hive they may do so by referencing the third-party package name in their DDL. On a side note, I think the ticket summary "Native Parquet Support in Hive" is misleading. Users who see this description in the release notes will conclude that the Parquet SerDe code lives in Hive when the exact opposite is true. Native Parquet Support in Hive -- Key: HIVE-5783 URL: https://issues.apache.org/jira/browse/HIVE-5783 Project: Hive Issue Type: New Feature Reporter: Justin Coffey Assignee: Justin Coffey Priority: Minor Fix For: 0.11.0 Attachments: HIVE-5783.patch, hive-0.11-parquet.patch Problem Statement: Hive would be easier to use if it had native Parquet support. Our organization, Criteo, uses Hive extensively. Therefore we built the Parquet Hive integration and would like to now contribute that integration to Hive.
About Parquet: Parquet is a columnar storage format for Hadoop and integrates with many Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration. Changes Details: Parquet was built with dependency management in mind and therefore only a single Parquet jar will be added as a dependency. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
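The first option in the comment on HIVE-5783 (a format registry that maps a DDL keyword to a third-party SerDe) might look roughly like the sketch below. Nothing here is an existing Hive API: the FormatRegistry class, its methods, and the class name com.example.parquet.ParquetSerDe are hypothetical names chosen for illustration.

```python
class FormatRegistry:
    """Maps a case-insensitive format keyword to a SerDe class name (hypothetical)."""

    def __init__(self):
        self._formats = {}

    def register(self, keyword, serde_class):
        """Associate a keyword with a SerDe; refuse to silently overwrite one."""
        key = keyword.upper()
        if key in self._formats:
            raise ValueError(f"format keyword already registered: {key}")
        self._formats[key] = serde_class

    def resolve(self, keyword):
        """Return the SerDe class name for a keyword used in a DDL statement."""
        key = keyword.upper()
        if key not in self._formats:
            raise KeyError(f"unknown format keyword: {key}")
        return self._formats[key]


# An administrator registers a third-party SerDe once (class name is made up).
registry = FormatRegistry()
registry.register("parquet", "com.example.parquet.ParquetSerDe")

# The DDL layer would then look the keyword up when it sees it in a statement.
serde = registry.resolve("PARQUET")
print(serde)
```

The point of the design is that the grammar stays fixed: new formats are added by registration rather than by changing the parser.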
[jira] [Commented] (HIVE-5911) Recent change to schema upgrade scripts breaks file naming conventions
[ https://issues.apache.org/jira/browse/HIVE-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842144#comment-13842144 ] Carl Steinbach commented on HIVE-5911: -- +1 (will commit in 24 hrs) Recent change to schema upgrade scripts breaks file naming conventions -- Key: HIVE-5911 URL: https://issues.apache.org/jira/browse/HIVE-5911 Project: Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Sergey Shelukhin Attachments: HIVE-5911.01.patch, HIVE-5911.patch The changes made in HIVE-5700 break the convention for naming schema upgrade scripts. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4395) Support TFetchOrientation.FIRST for HiveServer2 FetchResults
[ https://issues.apache.org/jira/browse/HIVE-4395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-4395: - Status: Open (was: Patch Available) I added a couple of comments on RB. Thanks. Support TFetchOrientation.FIRST for HiveServer2 FetchResults Key: HIVE-4395 URL: https://issues.apache.org/jira/browse/HIVE-4395 Project: Hive Issue Type: Improvement Components: HiveServer2, JDBC Affects Versions: 0.11.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-4395-1.patch, HIVE-4395.1.patch, HIVE-4395.2.patch, HIVE-4395.3.patch, HIVE-4395.4.patch Currently HiveServer2 only supports fetching the next rows (TFetchOrientation.NEXT). This ticket is to implement support for TFetchOrientation.FIRST, which resets the fetch position to the beginning of the resultset. -- This message was sent by Atlassian JIRA (v6.1#6144)
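The difference between the two fetch orientations in HIVE-4395 can be shown with a small in-memory cursor. This is only a sketch of the semantics described in the ticket, not HiveServer2 code: the ResultCursor class and the FetchOrientation enum are illustrative stand-ins for the server-side operation and the Thrift TFetchOrientation values.

```python
from enum import Enum


class FetchOrientation(Enum):
    """Stand-in for the two Thrift orientations discussed in the ticket."""
    NEXT = "FETCH_NEXT"
    FIRST = "FETCH_FIRST"


class ResultCursor:
    """Holds a result set and a fetch position (illustrative only)."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._pos = 0

    def fetch(self, orientation, max_rows):
        # FIRST rewinds to the start of the result set before fetching;
        # NEXT simply continues from the current position.
        if orientation is FetchOrientation.FIRST:
            self._pos = 0
        batch = self._rows[self._pos:self._pos + max_rows]
        self._pos += len(batch)
        return batch


cursor = ResultCursor(range(5))
first_batch = cursor.fetch(FetchOrientation.NEXT, 2)    # rows 0 and 1
second_batch = cursor.fetch(FetchOrientation.NEXT, 2)   # rows 2 and 3
rewound = cursor.fetch(FetchOrientation.FIRST, 2)       # rows 0 and 1 again
```

A real server would also need the underlying result data to be re-readable, which this sketch sidesteps by keeping all rows in a list.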
[jira] [Created] (HIVE-5977) Ability to selectively enable/disable WebHCat REST API components
Carl Steinbach created HIVE-5977: Summary: Ability to selectively enable/disable WebHCat REST API components Key: HIVE-5977 URL: https://issues.apache.org/jira/browse/HIVE-5977 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Carl Steinbach Assignee: Carl Steinbach -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5977) Ability to selectively enable/disable WebHCat REST API components
[ https://issues.apache.org/jira/browse/HIVE-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841772#comment-13841772 ] Carl Steinbach commented on HIVE-5977: -- The RPCs in the [WebHCat REST API|https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference] can be divided into the following categories: # General (version, status, etc.) # DDL # MapReduce job submission # Pig job submission # Hive job submission # Queue (job management, deprecated) # Jobs (job management) We should provide administrators with the ability to selectively disable categories 2-7 by setting properties in the WebHCat configuration file. By default all categories will be enabled. Ability to selectively enable/disable WebHCat REST API components - Key: HIVE-5977 URL: https://issues.apache.org/jira/browse/HIVE-5977 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Carl Steinbach Assignee: Carl Steinbach -- This message was sent by Atlassian JIRA (v6.1#6144)
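A sketch of how the per-category switches proposed in HIVE-5977 could be evaluated follows. The property names (webhcat.api.<category>.enabled) are invented for this example and are not real WebHCat configuration keys; the only behaviour taken from the ticket is that the General category cannot be disabled and that every other category defaults to enabled.

```python
# Categories 2-7 from the ticket; "general" (category 1) is always available.
OPTIONAL_CATEGORIES = ("ddl", "mapreduce", "pig", "hive", "queue", "jobs")


def enabled_categories(config):
    """Return the set of API categories that should be served.

    `config` is a plain dict of property name -> string value. The key
    pattern used here is a hypothetical naming scheme, not an actual
    WebHCat property.
    """
    enabled = {"general"}
    for category in OPTIONAL_CATEGORIES:
        key = f"webhcat.api.{category}.enabled"
        # A missing property means "enabled", matching the proposed default.
        value = config.get(key, "true")
        if value.strip().lower() == "true":
            enabled.add(category)
    return enabled


# Example: an administrator turns off Pig submission and the deprecated queue API.
site_config = {
    "webhcat.api.pig.enabled": "false",
    "webhcat.api.queue.enabled": "false",
}
active = enabled_categories(site_config)
print(sorted(active))
```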
[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841855#comment-13841855 ] Carl Steinbach commented on HIVE-5783: -- [~jcoffey] Would you and your coworkers be willing to consider the option of committing the SerDe code directly to Hive instead of having Hive depend on a third-party JAR? I appreciate that this will make it a little less convenient for you to push in changes. However, I think there are two big drawbacks to the third-party JAR approach: 1) existing Hive contributors will be much less likely to contribute improvements to this code since it lives in a different repository, and 2) Hive won't be able to benefit from parquet-serde improvements until they appear in a new parquet-serde release. Native Parquet Support in Hive -- Key: HIVE-5783 URL: https://issues.apache.org/jira/browse/HIVE-5783 Project: Hive Issue Type: New Feature Reporter: Justin Coffey Assignee: Justin Coffey Priority: Minor Fix For: 0.11.0 Attachments: HIVE-5783.patch, hive-0.11-parquet.patch Problem Statement: Hive would be easier to use if it had native Parquet support. Our organization, Criteo, uses Hive extensively. Therefore we built the Parquet Hive integration and would like to now contribute that integration to Hive. About Parquet: Parquet is a columnar storage format for Hadoop and integrates with many Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration. Changes Details: Parquet was built with dependency management in mind and therefore only a single Parquet jar will be added as a dependency. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-3746) Fix HS2 ResultSet Serialization Performance Regression
[ https://issues.apache.org/jira/browse/HIVE-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-3746: - Summary: Fix HS2 ResultSet Serialization Performance Regression (was: Fix HS2 Fetch performance regression) Fix HS2 ResultSet Serialization Performance Regression -- Key: HIVE-3746 URL: https://issues.apache.org/jira/browse/HIVE-3746 Project: Hive Issue Type: Sub-task Components: HiveServer2, Server Infrastructure Reporter: Carl Steinbach Labels: HiveServer2 -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5276) Skip redundant string encoding/decoding for hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840924#comment-13840924 ] Carl Steinbach commented on HIVE-5276: -- [~navis] I +1d this patch earlier so you're free to commit if the automated tests pass. Skip redundant string encoding/decoding for hiveserver2 --- Key: HIVE-5276 URL: https://issues.apache.org/jira/browse/HIVE-5276 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-5276.3.patch.txt, HIVE-5276.4.patch.txt, HIVE-5276.5.patch.txt hiveserver2 currently acquires rows in the string format used for CLI output, converts them back into rows, and then converts them again into the final format. This is inefficient and memory-consuming. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5972) Hiveserver2 is much slower than hiveserver1
[ https://issues.apache.org/jira/browse/HIVE-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840939#comment-13840939 ] Carl Steinbach commented on HIVE-5972: -- {quote} 1. hiveserver1 running on 2CPUs, 8G mem, took about 8 hours 2. hiveserver2 running on 4CPUs, 16 mem, took about 13 hours and 27min (never successful on machine with 2CPUs, 8G mem) {quote} How large was the resultset that your query generated? How much time did hs1/hs2 spend executing the query on the cluster as opposed to fetching the resultset through hs1/hs2? bq. But I cannot understand why hiveserver2 is much slower than hiveserver1, because from doc, hs2 support concurrency, it should be faster than hs1, isn't it? In this context concurrency means that HiveServer2 is able to handle multiple concurrent client connections, something that HS1 can't do. Also, regardless of whether you're using HS1 or HS2, the actual work of executing the query is delegated to the underlying MR cluster. The HS1/HS2 server coordinates query execution on the cluster and then acts as a relay through which the client fetches the results of the query. It's this last step (the result set fetch) which is known to be slower on HS2, and we're tracking the task of fixing this performance regression in HIVE-3746. So far the information you have provided makes me think that this is a duplicate of HIVE-3746. Please let us know if you think some other issue is causing the performance regression. Hiveserver2 is much slower than hiveserver1 --- Key: HIVE-5972 URL: https://issues.apache.org/jira/browse/HIVE-5972 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.10.0 Reporter: a bc Priority: Critical We are building an MS SQL cube through a linked server connecting to hiveserver with Cloudera's ODBC driver. There are two test results: 1. hiveserver1 running on 2CPUs, 8G mem, took about 8 hours 2. hiveserver2 running on 4CPUs, 16 mem, took about 13 hours and 27min (never successful on machine with 2CPUs, 8G mem) In both cases, almost all CPUs are busy when building the cube. But I cannot understand why hiveserver2 is much slower than hiveserver1: according to the docs, hs2 supports concurrency, so shouldn't it be faster than hs1? Thanks. CDH4.3 on CentOS6. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5911) Recent change to schema upgrade scripts breaks file naming conventions
[ https://issues.apache.org/jira/browse/HIVE-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13836812#comment-13836812 ] Carl Steinbach commented on HIVE-5911: -- Yes. Please put the upgrade SQL commands in a file named 015-HIVE-5700.db.sql and call it from upgrade-0.12.0-to-0.13.0.db.sql. Recent change to schema upgrade scripts breaks file naming conventions -- Key: HIVE-5911 URL: https://issues.apache.org/jira/browse/HIVE-5911 Project: Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Sergey Shelukhin The changes made in HIVE-5700 break the convention for naming schema upgrade scripts. -- This message was sent by Atlassian JIRA (v6.1#6144)
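The two file names mentioned in this comment suggest a pattern that could be checked mechanically. The regular expressions below are inferred from those two examples only and are an assumption, not a documented Hive rule; in particular, the sketch assumes that "db" stands for a lowercase database name such as mysql.

```python
import re

# Per-patch script, e.g. 015-HIVE-5700.<db>.sql: a three-digit sequence
# number, the JIRA key, then the database name (pattern inferred, not official).
PATCH_SCRIPT = re.compile(r"^(\d{3})-HIVE-(\d+)\.([a-z]+)\.sql$")

# Release upgrade driver, e.g. upgrade-0.12.0-to-0.13.0.<db>.sql.
UPGRADE_SCRIPT = re.compile(
    r"^upgrade-(\d+\.\d+\.\d+)-to-(\d+\.\d+\.\d+)\.([a-z]+)\.sql$"
)


def classify(filename):
    """Return "patch", "upgrade", or None if the name fits neither pattern."""
    if PATCH_SCRIPT.match(filename):
        return "patch"
    if UPGRADE_SCRIPT.match(filename):
        return "upgrade"
    return None


for name in ("015-HIVE-5700.mysql.sql",
             "upgrade-0.12.0-to-0.13.0.mysql.sql",
             "hive-5700-fix.sql"):
    print(name, "->", classify(name))
```

A check like this could run over the metastore script directories to flag files that drift from the convention.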
[jira] [Commented] (HIVE-5911) Recent change to schema upgrade scripts breaks file naming conventions
[ https://issues.apache.org/jira/browse/HIVE-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837069#comment-13837069 ] Carl Steinbach commented on HIVE-5911: -- The MySQL and Postgresql upgrade scripts are supposed to start by printing the name of the corresponding patch to stdout, e.g: SELECT ' HIVE-3255 Storing delegation tokens in metastore '; I noticed that this echo statement is missing in the following scripts: MySQL: 14, 15 Postgresql: 11, 14, 15 Would you mind fixing this issue in this patch? Everything else looks good to me. Recent change to schema upgrade scripts breaks file naming conventions -- Key: HIVE-5911 URL: https://issues.apache.org/jira/browse/HIVE-5911 Project: Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Sergey Shelukhin Attachments: HIVE-5911.patch The changes made in HIVE-5700 break the convention for naming schema upgrade scripts. -- This message was sent by Atlassian JIRA (v6.1#6144)
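The missing echo statement could also be detected automatically. The sketch below assumes that the echo must be the first non-comment statement in the script and that it always mentions a HIVE-NNNN key inside a single-quoted string; both are readings of the comment above rather than stated rules, and the helper names are made up.

```python
import re

# Matches statements like: SELECT ' HIVE-3255 Storing delegation tokens ';
ECHO = re.compile(r"^\s*SELECT\s+'[^']*HIVE-\d+[^']*'\s*;", re.IGNORECASE)


def first_statement(sql_text):
    """Return the first line that is neither blank nor a '--' comment."""
    for line in sql_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("--"):
            return stripped
    return ""


def has_patch_echo(sql_text):
    """True if the script starts by echoing the name of its patch."""
    return bool(ECHO.match(first_statement(sql_text)))


good = "SELECT ' HIVE-3255 Storing delegation tokens in metastore ';\nCREATE TABLE t (id INT);"
bad = "CREATE TABLE t (id INT);"
print(has_patch_echo(good), has_patch_echo(bad))
```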
[jira] [Created] (HIVE-5911) Recent change to schema upgrade scripts breaks file naming conventions
Carl Steinbach created HIVE-5911: Summary: Recent change to schema upgrade scripts breaks file naming conventions Key: HIVE-5911 URL: https://issues.apache.org/jira/browse/HIVE-5911 Project: Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach The changes made in HIVE-5700 break the convention for naming schema upgrade scripts. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5911) Recent change to schema upgrade scripts breaks file naming conventions
[ https://issues.apache.org/jira/browse/HIVE-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836330#comment-13836330 ] Carl Steinbach commented on HIVE-5911: -- [~sershe] Can you please fix this? Thanks. Recent change to schema upgrade scripts breaks file naming conventions -- Key: HIVE-5911 URL: https://issues.apache.org/jira/browse/HIVE-5911 Project: Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach The changes made in HIVE-5700 break the convention for naming schema upgrade scripts. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5879) Fix spelling errors in hive-default.xml
[ https://issues.apache.org/jira/browse/HIVE-5879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13834172#comment-13834172 ] Carl Steinbach commented on HIVE-5879: -- bq. It seems like it would be nice to doc the conf values in HiveConf and then have a main method which generates the syntax needed for the wiki. I have wanted to do the same thing for a long time. As I recall we ended up with hive-default.xml.template because some folks argued that it was easier to access than pulling up the right wiki page. I think these concerns could largely be addressed by adding a DESCRIBE CONF or SHOW CONF command. Fix spelling errors in hive-default.xml --- Key: HIVE-5879 URL: https://issues.apache.org/jira/browse/HIVE-5879 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Lefty Leverenz Priority: Trivial See https://issues.apache.org/jira/browse/HIVE-5400?focusedCommentId=13830626&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13830626 -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5898) Make fetching of column statistics configurable
[ https://issues.apache.org/jira/browse/HIVE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5898: - Status: Open (was: Patch Available) Please post the patch on reviewboard. Thanks. Make fetching of column statistics configurable --- Key: HIVE-5898 URL: https://issues.apache.org/jira/browse/HIVE-5898 Project: Hive Issue Type: Sub-task Components: Query Processor, Statistics Reporter: Prasanth J Assignee: Prasanth J Fix For: 0.13.0 Attachments: HIVE-5898.1.patch This is a subtask of HIVE-5369. The current metastore API can only fetch column statistics on a per-partition, per-column basis. This will impact the performance of statistics annotation (during explain or explain extended) when the number of partitions and the number of columns are large. Until the metastore API is fixed, make fetching of column statistics configurable and set its default to false. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5230) Better error reporting by async threads in HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13834224#comment-13834224 ] Carl Steinbach commented on HIVE-5230: -- I'm looking at it now. Better error reporting by async threads in HiveServer2 -- Key: HIVE-5230 URL: https://issues.apache.org/jira/browse/HIVE-5230 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.12.0, 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-5230.1.patch, HIVE-5230.1.patch, HIVE-5230.2.patch, HIVE-5230.3.patch, HIVE-5230.4.patch, HIVE-5230.6.patch, HIVE-5230.7.patch, HIVE-5230.8.patch [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. When a background thread gets an error, currently the client can only poll for the operation state and also the error with its stacktrace is logged. However, it will be useful to provide a richer error response like thrift API does with TStatus (which is constructed while building a Thrift response object). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5230) Better error reporting by async threads in HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13834296#comment-13834296 ] Carl Steinbach commented on HIVE-5230: -- [~vaibhavgumashta] The current version of the patch introduces a lot of formatting errors (looks like many files had their 2 space indents changed to TABs). Can you please fix these issues and post an updated version on reviewboard? Thanks. Better error reporting by async threads in HiveServer2 -- Key: HIVE-5230 URL: https://issues.apache.org/jira/browse/HIVE-5230 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.12.0, 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-5230.1.patch, HIVE-5230.1.patch, HIVE-5230.2.patch, HIVE-5230.3.patch, HIVE-5230.4.patch, HIVE-5230.6.patch, HIVE-5230.7.patch, HIVE-5230.8.patch [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. When a background thread gets an error, currently the client can only poll for the operation state and also the error with its stacktrace is logged. However, it will be useful to provide a richer error response like thrift API does with TStatus (which is constructed while building a Thrift response object). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5893) hive-schema-0.13.0.mysql.sql contains reference to nonexistent column
Carl Steinbach created HIVE-5893: Summary: hive-schema-0.13.0.mysql.sql contains reference to nonexistent column Key: HIVE-5893 URL: https://issues.apache.org/jira/browse/HIVE-5893 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0 Reporter: Carl Steinbach -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5893) hive-schema-0.13.0.mysql.sql contains reference to nonexistent column
[ https://issues.apache.org/jira/browse/HIVE-5893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832960#comment-13832960 ] Carl Steinbach commented on HIVE-5893: --
{noformat}
mysql> source hive-schema-0.13.0.mysql.sql
...
ERROR 1054 (42S22): Unknown column 'VERSION_COMMENT' in 'field list'
Query OK, 0 rows affected (0.00 sec)
...
{noformat}
Here's the problem:
{noformat}
% diff hive-schema-0.12.0.mysql.sql hive-schema-0.13.0.mysql.sql
760c760
< `VERSION_COMMENT` VARCHAR(255),
---
> `COMMENT` VARCHAR(255),
764c764
< INSERT INTO VERSION (VER_ID, SCHEMA_VERSION, VERSION_COMMENT) VALUES (1, '0.12.0', 'Hive release version 0.12.0');
---
> INSERT INTO VERSION (VER_ID, SCHEMA_VERSION, VERSION_COMMENT) VALUES (1, '0.13.0', 'Hive release version 0.13.0');
%
{noformat}
hive-schema-0.13.0.mysql.sql contains reference to nonexistent column - Key: HIVE-5893 URL: https://issues.apache.org/jira/browse/HIVE-5893 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0 Reporter: Carl Steinbach -- This message was sent by Atlassian JIRA (v6.1#6144)
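The mismatch reported above (the table declares a COMMENT column while the INSERT still names VERSION_COMMENT) is the kind of error a quick smoke test can catch before a schema file ships. The sketch below is illustrative only: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the helper name `insert_matches_schema` is hypothetical, and the CREATE TABLE statements are simplified assumptions rather than the real contents of the Hive schema files.

```python
import sqlite3


def insert_matches_schema(create_sql: str, insert_sql: str) -> bool:
    """Return True if insert_sql runs against the table defined by create_sql."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(create_sql)
        conn.execute(insert_sql)
        return True
    except sqlite3.OperationalError:
        # SQLite raises OperationalError when the INSERT names an unknown column.
        return False
    finally:
        conn.close()


# Simplified stand-ins for the VERSION table in the two schema files (assumed column types).
CREATE_0_12 = (
    "CREATE TABLE VERSION (VER_ID INTEGER PRIMARY KEY, "
    "SCHEMA_VERSION VARCHAR(127), VERSION_COMMENT VARCHAR(255))"
)
CREATE_0_13 = (
    "CREATE TABLE VERSION (VER_ID INTEGER PRIMARY KEY, "
    "SCHEMA_VERSION VARCHAR(127), COMMENT VARCHAR(255))"
)
INSERT_0_13 = (
    "INSERT INTO VERSION (VER_ID, SCHEMA_VERSION, VERSION_COMMENT) "
    "VALUES (1, '0.13.0', 'Hive release version 0.13.0')"
)
```

With these stand-ins, the INSERT succeeds against the 0.12.0-style table and fails against the 0.13.0-style table, mirroring the "Unknown column" error in the report.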
[jira] [Updated] (HIVE-5893) hive-schema-0.13.0.mysql.sql contains reference to nonexistent column
[ https://issues.apache.org/jira/browse/HIVE-5893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5893: - Attachment: HIVE-5892.1.patch.txt hive-schema-0.13.0.mysql.sql contains reference to nonexistent column - Key: HIVE-5893 URL: https://issues.apache.org/jira/browse/HIVE-5893 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0 Reporter: Carl Steinbach Attachments: HIVE-5892.1.patch.txt -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5893) hive-schema-0.13.0.mysql.sql contains reference to nonexistent column
[ https://issues.apache.org/jira/browse/HIVE-5893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5893: - Status: Patch Available (was: Open) hive-schema-0.13.0.mysql.sql contains reference to nonexistent column - Key: HIVE-5893 URL: https://issues.apache.org/jira/browse/HIVE-5893 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0 Reporter: Carl Steinbach Attachments: HIVE-5892.1.patch.txt -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4977) HS2: support an alternate resultset serialization format between client and server
[ https://issues.apache.org/jira/browse/HIVE-4977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-4977: - Summary: HS2: support an alternate resultset serialization format between client and server (was: HS2: support an alternate serialization protocol between client and server) HS2: support an alternate resultset serialization format between client and server -- Key: HIVE-4977 URL: https://issues.apache.org/jira/browse/HIVE-4977 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.10.0, 0.11.0, 0.12.0 Reporter: Chris Drome Assignee: Chris Drome Current serialization protocol between client and server as defined in cli_service.thrift results in 2x (or more) throughput degradation compared to HS1. Initial proposal is to introduce HS1 serialization protocol as a negotiable alternative. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5217) Add long polling to asynchronous execution in HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13820372#comment-13820372 ] Carl Steinbach commented on HIVE-5217: -- [~vaibhavgumashta] Is the patch ready for review? Add long polling to asynchronous execution in HiveServer2 - Key: HIVE-5217 URL: https://issues.apache.org/jira/browse/HIVE-5217 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-5217.2.patch, HIVE-5217.3.patch, HIVE-5217.D12801.2.patch, HIVE-5217.D12801.3.patch, HIVE-5217.D12801.4.patch, HIVE-5217.D12801.5.patch, HIVE-5217.D12801.6.patch [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. The client gets an operation handle which it can poll to check on the operation status. However, the polling frequency is entirely left to the client which can be resource inefficient. Long polling will solve this, by blocking the client request to check the operation status for a configurable amount of time (a new HS2 config) if the data is not available, but responding immediately if the data is available. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5217) Add long polling to asynchronous execution in HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13820574#comment-13820574 ] Carl Steinbach commented on HIVE-5217: -- I left some more comments on reviewboard. Are you planning to move the long polling timeout logic from SQLOperation.getState() to CLIService? Add long polling to asynchronous execution in HiveServer2 - Key: HIVE-5217 URL: https://issues.apache.org/jira/browse/HIVE-5217 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-5217.2.patch, HIVE-5217.3.patch, HIVE-5217.D12801.2.patch, HIVE-5217.D12801.3.patch, HIVE-5217.D12801.4.patch, HIVE-5217.D12801.5.patch, HIVE-5217.D12801.6.patch [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. The client gets an operation handle which it can poll to check on the operation status. However, the polling frequency is entirely left to the client which can be resource inefficient. Long polling will solve this, by blocking the client request to check the operation status for a configurable amount of time (a new HS2 config) if the data is not available, but responding immediately if the data is available. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5230) Better error reporting by async threads in HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13819295#comment-13819295 ] Carl Steinbach commented on HIVE-5230: -- I left some comments on reviewboard. Thanks. Better error reporting by async threads in HiveServer2 -- Key: HIVE-5230 URL: https://issues.apache.org/jira/browse/HIVE-5230 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.12.0, 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-5230.1.patch, HIVE-5230.1.patch, HIVE-5230.2.patch, HIVE-5230.3.patch, HIVE-5230.4.patch, HIVE-5230.6.patch [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. When a background thread gets an error, currently the client can only poll for the operation state and also the error with its stacktrace is logged. However, it will be useful to provide a richer error response like thrift API does with TStatus (which is constructed while building a Thrift response object). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5786) Remove HadoopShims methods that were needed for pre-Hadoop 0.20
[ https://issues.apache.org/jira/browse/HIVE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5786: - Resolution: Fixed Fix Version/s: 0.13.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Jason! Remove HadoopShims methods that were needed for pre-Hadoop 0.20 --- Key: HIVE-5786 URL: https://issues.apache.org/jira/browse/HIVE-5786 Project: Hive Issue Type: Bug Components: Shims Reporter: Jason Dere Assignee: Jason Dere Fix For: 0.13.0 Attachments: HIVE-5786.1.patch There are several methods in HadoopShims that can be removed since we are only supporting 0.20+. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817609#comment-13817609 ] Carl Steinbach commented on HIVE-5783: -- [~jcoffey] I added you to the list of Hive contributors on JIRA. Feel free to assign this ticket to yourself. Thanks. Native Parquet Support in Hive -- Key: HIVE-5783 URL: https://issues.apache.org/jira/browse/HIVE-5783 Project: Hive Issue Type: New Feature Reporter: Justin Coffey Priority: Minor Problem Statement: Hive would be easier to use if it had native Parquet support. Our organization, Criteo, uses Hive extensively. Therefore we built the Parquet Hive integration and would like to now contribute that integration to Hive. About Parquet: Parquet is a columnar storage format for Hadoop and integrates with many Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration. Changes Details: Parquet was built with dependency management in mind and therefore only a single Parquet jar will be added as a dependency. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5217) Add long polling to asynchronous execution in HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817046#comment-13817046 ] Carl Steinbach commented on HIVE-5217: -- bq. It wouldn't be too hard to add support for a non long polling getOperationStatus call in future, specially a non rpc call. Adding a getOperationStatusInternal() or getOperationStatusNonLongPoll() method to work around this problem is exactly the sort of thing I want to avoid. This patch implements a service layer feature, so I think it makes sense that the implementation belongs in the service layer as well. In CLIService.getOperationStatus() you have access to the SessionManager, and from that you can easily get the Session object, the Operation object, the operation type, and the Session's configuration. As an added bonus you also get the OperationHandle which makes it easier to emit useful log messages that reference the session and operation IDs. Add long polling to asynchronous execution in HiveServer2 - Key: HIVE-5217 URL: https://issues.apache.org/jira/browse/HIVE-5217 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-5217.D12801.2.patch, HIVE-5217.D12801.3.patch, HIVE-5217.D12801.4.patch, HIVE-5217.D12801.5.patch, HIVE-5217.D12801.6.patch [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. The client gets an operation handle which it can poll to check on the operation status. However, the polling frequency is entirely left to the client which can be resource inefficient. Long polling will solve this, by blocking the client request to check the operation status for a configurable amount of time (a new HS2 config) if the data is not available, but responding immediately if the data is available. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5217) Add long polling to asynchronous execution in HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817052#comment-13817052 ] Carl Steinbach commented on HIVE-5217: -- I posted the patch to reviewboard and left some comments: https://reviews.apache.org/r/15337/ Based on the test case included with the patch it looks like hive.server2.long.polling.timeout sets a lower bound on the total round trip time required to execute a getOperationStatus RPC, e.g. if polling.timeout = 5000 my query can finish after a second but getOperationStatus will still block for another four seconds before returning. Is this accurate? Add long polling to asynchronous execution in HiveServer2 - Key: HIVE-5217 URL: https://issues.apache.org/jira/browse/HIVE-5217 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-5217.D12801.2.patch, HIVE-5217.D12801.3.patch, HIVE-5217.D12801.4.patch, HIVE-5217.D12801.5.patch, HIVE-5217.D12801.6.patch [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. The client gets an operation handle which it can poll to check on the operation status. However, the polling frequency is entirely left to the client which can be resource inefficient. Long polling will solve this, by blocking the client request to check the operation status for a configurable amount of time (a new HS2 config) if the data is not available, but responding immediately if the data is available. -- This message was sent by Atlassian JIRA (v6.1#6144)
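The question above turns on whether the timeout is a lower bound or an upper bound on how long the status call blocks. The issue description asks for the latter: block for at most the configured time, but respond immediately once the result is available. The following is a minimal sketch of that behavior, not Hive's implementation. The `Operation` and `Service` classes are hypothetical stand-ins for the operation and CLIService layers, and the wait lives in the service layer so that the plain `get_state()` call never blocks internal callers.

```python
import threading


class Operation:
    """Stand-in for a background operation (not Hive's Operation class)."""

    def __init__(self):
        self._done = threading.Event()
        self._state = "RUNNING"

    def finish(self):
        # Update the state first, then wake any long-polling waiters.
        self._state = "FINISHED"
        self._done.set()

    def get_state(self):
        # Never blocks, so a housekeeping or monitoring thread is unaffected.
        return self._state

    def wait_done(self, timeout_s):
        # Returns as soon as finish() is called, or after timeout_s at most.
        return self._done.wait(timeout_s)


class Service:
    """Stand-in for the service layer that remote clients call."""

    def __init__(self, long_polling_timeout_s):
        self._timeout_s = long_polling_timeout_s
        self._operations = {}

    def register(self, handle, operation):
        self._operations[handle] = operation

    def get_operation_status(self, handle):
        operation = self._operations[handle]
        # The timeout is an upper bound: the call returns early on completion.
        operation.wait_done(self._timeout_s)
        return operation.get_state()
```

Under this sketch, a query that finishes after one second with a five-second timeout would return from `get_operation_status` after roughly one second, not five.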
[jira] [Commented] (HIVE-5217) Add long polling to asynchronous execution in HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13815012#comment-13815012 ] Carl Steinbach commented on HIVE-5217: -- [~vaibhavgumashta] Version 5 of the patch won't apply cleanly on trunk. Can you please rebase the patch and post a review request on RB? One concern I have with the patch is that the delay period affects everyone who calls SQLOperation.getStatus(). Ideally we would be able to limit this behavior to remote clients only. I'm worried that at some point in the future we're going to want to have a housekeeping or monitoring thread that periodically calls getStatus() on all active operations, and we don't want getStatus() to block in these types of situations. Would it be practical to relocate this logic to the CLIService layer so that it only impacts CLIService clients? Add long polling to asynchronous execution in HiveServer2 - Key: HIVE-5217 URL: https://issues.apache.org/jira/browse/HIVE-5217 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-5217.D12801.2.patch, HIVE-5217.D12801.3.patch, HIVE-5217.D12801.4.patch, HIVE-5217.D12801.5.patch [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. The client gets an operation handle which it can poll to check on the operation status. However, the polling frequency is entirely left to the client which can be resource inefficient. Long polling will solve this, by blocking the client request to check the operation status for a configurable amount of time (a new HS2 config) if the data is not available, but responding immediately if the data is available. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5711) m2eclipse does not work and eclipse:eclipse requires a manual fix
[ https://issues.apache.org/jira/browse/HIVE-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5711: - Status: Open (was: Patch Available) m2eclipse does not work and eclipse:eclipse requires a manual fix - Key: HIVE-5711 URL: https://issues.apache.org/jira/browse/HIVE-5711 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Carl Steinbach Priority: Critical Labels: Eclipse, Maven Attachments: HIVE-5711.1.patch, HIVE-5711.2.patch, HIVE-5711.3.patch, HIVE-5711.5.patch As discussed [here|https://issues.apache.org/jira/browse/HIVE-5610?focusedCommentId=13809855page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13809855] m2eclipse doesn't work with the new maven change. Additionally as discussed [here|https://issues.apache.org/jira/browse/HIVE-5610?focusedCommentId=13809910page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13809910] eclipse:eclipse requires removing the hive-shims reference from all classpath files. We should figure out how to resolve this. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5711) m2eclipse does not work and eclipse:eclipse requires a manual fix
[ https://issues.apache.org/jira/browse/HIVE-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5711: - Attachment: HIVE-5711.5.patch m2eclipse does not work and eclipse:eclipse requires a manual fix - Key: HIVE-5711 URL: https://issues.apache.org/jira/browse/HIVE-5711 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Carl Steinbach Priority: Critical Labels: Eclipse, Maven Attachments: HIVE-5711.1.patch, HIVE-5711.2.patch, HIVE-5711.3.patch, HIVE-5711.5.patch As discussed [here|https://issues.apache.org/jira/browse/HIVE-5610?focusedCommentId=13809855page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13809855] m2eclipse doesn't work with the new maven change. Additionally as discussed [here|https://issues.apache.org/jira/browse/HIVE-5610?focusedCommentId=13809910page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13809910] eclipse:eclipse requires removing the hive-shims reference from all classpath files. We should figure out how to resolve this. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5711) m2eclipse does not work and eclipse:eclipse requires a manual fix
[ https://issues.apache.org/jira/browse/HIVE-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5711: - Status: Patch Available (was: Open) m2eclipse does not work and eclipse:eclipse requires a manual fix - Key: HIVE-5711 URL: https://issues.apache.org/jira/browse/HIVE-5711 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Carl Steinbach Priority: Critical Labels: Eclipse, Maven Attachments: HIVE-5711.1.patch, HIVE-5711.2.patch, HIVE-5711.3.patch, HIVE-5711.5.patch As discussed [here|https://issues.apache.org/jira/browse/HIVE-5610?focusedCommentId=13809855page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13809855] m2eclipse doesn't work with the new maven change. Additionally as discussed [here|https://issues.apache.org/jira/browse/HIVE-5610?focusedCommentId=13809910page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13809910] eclipse:eclipse requires removing the hive-shims reference from all classpath files. We should figure out how to resolve this. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5711) m2eclipse does not work and eclipse:eclipse requires a manual fix
[ https://issues.apache.org/jira/browse/HIVE-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815092#comment-13815092 ] Carl Steinbach commented on HIVE-5711: -- [~brocknoland] Thanks for helping with this. I ran into some problems while testing this with M2Eclipse and have given up trying to make it work. Consequently, I removed the lifecycle-mapping clauses that I had added to the pluginManagement sections of various POMs. I have verified that the most recent version of the patch works with 'eclipse:eclipse'. m2eclipse does not work and eclipse:eclipse requires a manual fix - Key: HIVE-5711 URL: https://issues.apache.org/jira/browse/HIVE-5711 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Carl Steinbach Priority: Critical Labels: Eclipse, Maven Attachments: HIVE-5711.1.patch, HIVE-5711.2.patch, HIVE-5711.3.patch, HIVE-5711.5.patch As discussed [here|https://issues.apache.org/jira/browse/HIVE-5610?focusedCommentId=13809855&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13809855] m2eclipse doesn't work with the new maven change. Additionally as discussed [here|https://issues.apache.org/jira/browse/HIVE-5610?focusedCommentId=13809910&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13809910] eclipse:eclipse requires removing the hive-shims reference from all classpath files. We should figure out how to resolve this. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5711) Fix eclipse:eclipse maven goal
[ https://issues.apache.org/jira/browse/HIVE-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5711: - Summary: Fix eclipse:eclipse maven goal (was: m2eclipse does not work and eclipse:eclipse requires a manual fix) Fix eclipse:eclipse maven goal -- Key: HIVE-5711 URL: https://issues.apache.org/jira/browse/HIVE-5711 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Carl Steinbach Priority: Critical Labels: Eclipse, Maven Attachments: HIVE-5711.1.patch, HIVE-5711.2.patch, HIVE-5711.3.patch, HIVE-5711.5.patch As discussed [here|https://issues.apache.org/jira/browse/HIVE-5610?focusedCommentId=13809855page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13809855] m2eclipse doesn't work with the new maven change. Additionally as discussed [here|https://issues.apache.org/jira/browse/HIVE-5610?focusedCommentId=13809910page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13809910] eclipse:eclipse requires removing the hive-shims reference from all classpath files. We should figure out how to resolve this. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5731) Use new GenericUDF instead of basic UDF for UDFDate* classes
[ https://issues.apache.org/jira/browse/HIVE-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5731: - Status: Open (was: Patch Available) Use new GenericUDF instead of basic UDF for UDFDate* classes - Key: HIVE-5731 URL: https://issues.apache.org/jira/browse/HIVE-5731 Project: Hive Issue Type: Improvement Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: HIVE-5731.1.patch, HIVE-5731.2.patch, HIVE-5731.3.patch GenericUDF class is the latest and recommended base class for any UDFs. This JIRA is to change the current UDFDate* classes to extend GenericUDF. The general benefit of GenericUDF is described in comments as * The GenericUDF are superior to normal UDFs in the following ways: 1. It can accept arguments of complex types, and return complex types. 2. It can accept variable length of arguments. 3. It can accept an infinite number of function signatures - for example, it's easy to write a GenericUDF that accepts array<int>, array<array<int>> and so on (arbitrary levels of nesting). 4. It can do short-circuit evaluations using DeferedObject. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5731) Use new GenericUDF instead of basic UDF for UDFDate* classes
[ https://issues.apache.org/jira/browse/HIVE-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5731: - Status: Patch Available (was: Open) Use new GenericUDF instead of basic UDF for UDFDate* classes - Key: HIVE-5731 URL: https://issues.apache.org/jira/browse/HIVE-5731 Project: Hive Issue Type: Improvement Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: HIVE-5731.1.patch, HIVE-5731.2.patch, HIVE-5731.3.patch GenericUDF class is the latest and recommended base class for any UDFs. This JIRA is to change the current UDFDate* classes to extend GenericUDF. The general benefit of GenericUDF is described in comments as * The GenericUDF are superior to normal UDFs in the following ways: 1. It can accept arguments of complex types, and return complex types. 2. It can accept variable length of arguments. 3. It can accept an infinite number of function signatures - for example, it's easy to write a GenericUDF that accepts array<int>, array<array<int>> and so on (arbitrary levels of nesting). 4. It can do short-circuit evaluations using DeferedObject. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5711) m2eclipse does not work and eclipse:eclipse requires a manual fix
[ https://issues.apache.org/jira/browse/HIVE-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5711: - Assignee: Carl Steinbach Status: Patch Available (was: Open) m2eclipse does not work and eclipse:eclipse requires a manual fix - Key: HIVE-5711 URL: https://issues.apache.org/jira/browse/HIVE-5711 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Carl Steinbach Priority: Critical Labels: Eclipse, Maven Attachments: HIVE-5711.1.patch, HIVE-5711.2.patch As discussed [here|https://issues.apache.org/jira/browse/HIVE-5610?focusedCommentId=13809855page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13809855] m2eclipse doesn't work with the new maven change. Additionally as discussed [here|https://issues.apache.org/jira/browse/HIVE-5610?focusedCommentId=13809910page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13809910] eclipse:eclipse requires removing the hive-shims reference from all classpath files. We should figure out how to resolve this. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5711) m2eclipse does not work and eclipse:eclipse requires a manual fix
[ https://issues.apache.org/jira/browse/HIVE-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5711: - Attachment: HIVE-5711.2.patch m2eclipse does not work and eclipse:eclipse requires a manual fix - Key: HIVE-5711 URL: https://issues.apache.org/jira/browse/HIVE-5711 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Priority: Critical Labels: Eclipse, Maven Attachments: HIVE-5711.1.patch, HIVE-5711.2.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5711) m2eclipse does not work and eclipse:eclipse requires a manual fix
[ https://issues.apache.org/jira/browse/HIVE-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5711: - Attachment: HIVE-5711.1.patch m2eclipse does not work and eclipse:eclipse requires a manual fix - Key: HIVE-5711 URL: https://issues.apache.org/jira/browse/HIVE-5711 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Priority: Critical Labels: Eclipse, Maven Attachments: HIVE-5711.1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5711) m2eclipse does not work and eclipse:eclipse requires a manual fix
[ https://issues.apache.org/jira/browse/HIVE-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812650#comment-13812650 ] Carl Steinbach commented on HIVE-5711: -- Review request: https://reviews.apache.org/r/15203/ m2eclipse does not work and eclipse:eclipse requires a manual fix - Key: HIVE-5711 URL: https://issues.apache.org/jira/browse/HIVE-5711 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Priority: Critical Labels: Eclipse, Maven Attachments: HIVE-5711.1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5610) Merge maven branch into trunk
[ https://issues.apache.org/jira/browse/HIVE-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810502#comment-13810502 ] Carl Steinbach commented on HIVE-5610: -- [~brocknoland] [~appodictic]: Thanks for all of your hard work! I'm +1 on merging the current version of the patch. [~brocknoland] Can you file a ticket to track the m2eclipse/shims issue? Thanks. Merge maven branch into trunk - Key: HIVE-5610 URL: https://issues.apache.org/jira/browse/HIVE-5610 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5610.1-for-commit.patch, HIVE-5610.1-for-reading.patch, HIVE-5610.1-maven.patch, HIVE-5610.2-for-commit.patch, HIVE-5610.2-for-reading.patch, HIVE-5610.2-maven.patch, HIVE-5610.4-for-commit.patch, HIVE-5610.4-for-reading.patch, HIVE-5610.4-maven.patch, HIVE-5610.5-for-commit.patch, HIVE-5610.5-for-reading.patch, HIVE-5610.5-maven.patch, HIVE-5610.6-for-commit.patch, HIVE-5610.6-for-reading.patch, HIVE-5610.6-maven.patch, Screen Shot 2013-10-30 at 11.42.03 PM.png With HIVE-5566 complete we are ready to merge the maven branch to trunk. The following tasks will be done post-merge: * HIVE-5611 - Add assembly (i.e.) tar creation to pom The merge process will be as follows: 1) Disable the precommit build 2) Apply patch 3) Commit result {noformat} svn status svn add .. svn commit -m "HIVE-5610 - Merge maven branch into trunk (patch)" {noformat} 4) Modify maven-rollforward.sh to use svn mv not mv: {noformat} perl -i -pe 's@^ mv @ svn mv @g' maven-rollforward.sh {noformat} 5) Execute maven-rollforward.sh and commit result {noformat} bash ./maven-rollforward.sh svn status ... svn commit -m "HIVE-5610 - Merge maven branch into trunk (maven rollforward)" {noformat} 6) Modify maven-delete-ant.sh to use svn rm as opposed to rm: {noformat} perl -i -pe 's@^ rm -rf @ svn rm @g' maven-delete-ant.sh {noformat} 7) Execute maven-delete-ant.sh and commit result {noformat} bash ./maven-delete-ant.sh svn status ... svn commit -m "HIVE-5610 - Merge maven branch into trunk (delete ant)" {noformat} 8) Update trunk-mr1.properties and trunk-mr2.properties on the ptesting host, adding the following: {noformat} mavenEnvOpts = -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128 testCasePropertyName = test buildTool = maven unitTests.directories = ./ {noformat} 9) Enable the precommit build h3. Notes: h4. On this jira I will upload three patches: {noformat} HIVE-5610.${VERSION}-for-reading.patch HIVE-5610.${VERSION}-for-commit.patch HIVE-5610.${VERSION}-maven.patch {noformat} * for-reading has no qfiles updates so it's easier to read * for-commit has the qfile updates and is for commit * maven is the patch in a rollforward state for testing purposes h4. To build everything you must: {noformat} $ mvn clean install -DskipTests $ cd itests $ mvn clean install -DskipTests {noformat} because itests (any tests that have cyclical dependencies or require that the packages be built) is not part of the root reactor build. -- This message was sent by Atlassian JIRA (v6.1#6144)
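The Perl substitution quoted in the merge process of the message above (rewriting `mv` to `svn mv` in maven-rollforward.sh) can be sketched in Python to show what it does to each script line. This is a minimal illustration only: the function name and the sample lines are hypothetical, and the exact leading whitespace in the pattern is an assumption, since spacing may have been collapsed when the message was archived.

```python
import re

# Pattern copied from the quoted one-liner. The leading whitespace is an
# assumption: it may have been collapsed when the message was archived.
MV_PATTERN = re.compile(r"^ mv ")


def use_svn_mv(line: str) -> str:
    """Rewrite a line that starts with ' mv ' so that it starts with ' svn mv '."""
    return MV_PATTERN.sub(" svn mv ", line)


# Hypothetical script lines, for illustration only.
script = [
    " mv $source $target",
    "echo mv is not touched here",
]
rewritten = [use_svn_mv(line) for line in script]
```

Only lines that begin with the anchored pattern change, so an `mv` appearing elsewhere in a line is left alone. The `rm -rf` to `svn rm` substitution for maven-delete-ant.sh follows the same shape.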
[jira] [Commented] (HIVE-5610) Merge maven branch into trunk
[ https://issues.apache.org/jira/browse/HIVE-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810519#comment-13810519 ] Carl Steinbach commented on HIVE-5610: -- I'm fine with you committing the patch. Merge maven branch into trunk - Key: HIVE-5610 URL: https://issues.apache.org/jira/browse/HIVE-5610 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5610.1-for-commit.patch, HIVE-5610.1-for-reading.patch, HIVE-5610.1-maven.patch, HIVE-5610.2-for-commit.patch, HIVE-5610.2-for-reading.patch, HIVE-5610.2-maven.patch, HIVE-5610.4-for-commit.patch, HIVE-5610.4-for-reading.patch, HIVE-5610.4-maven.patch, HIVE-5610.5-for-commit.patch, HIVE-5610.5-for-reading.patch, HIVE-5610.5-maven.patch, HIVE-5610.6-for-commit.patch, HIVE-5610.6-for-reading.patch, HIVE-5610.6-maven.patch, Screen Shot 2013-10-30 at 11.42.03 PM.png -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5676) Cleanup test cases as done during mavenization
[ https://issues.apache.org/jira/browse/HIVE-5676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5676: - Component/s: Testing Infrastructure Build Infrastructure Cleanup test cases as done during mavenization -- Key: HIVE-5676 URL: https://issues.apache.org/jira/browse/HIVE-5676 Project: Hive Issue Type: Bug Components: Build Infrastructure, Testing Infrastructure Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.13.0 Attachments: HIVE-5676.patch, HIVE-5676.patch A number of issues were found in HIVE-5107 and we plan on committing them directly to trunk. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5610) Merge maven branch into trunk
[ https://issues.apache.org/jira/browse/HIVE-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5610: - Labels: Maven (was: ) Merge maven branch into trunk - Key: HIVE-5610 URL: https://issues.apache.org/jira/browse/HIVE-5610 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Brock Noland Labels: Maven Fix For: 0.13.0 Attachments: HIVE-5610.1-for-commit.patch, HIVE-5610.1-for-reading.patch, HIVE-5610.1-maven.patch, HIVE-5610.2-for-commit.patch, HIVE-5610.2-for-reading.patch, HIVE-5610.2-maven.patch, HIVE-5610.4-for-commit.patch, HIVE-5610.4-for-reading.patch, HIVE-5610.4-maven.patch, HIVE-5610.5-for-commit.patch, HIVE-5610.5-for-reading.patch, HIVE-5610.5-maven.patch, HIVE-5610.6-for-commit.patch, HIVE-5610.6-for-reading.patch, HIVE-5610.6-maven.patch, Screen Shot 2013-10-30 at 11.42.03 PM.png -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5611) Add assembly (i.e.) tar creation to pom
[ https://issues.apache.org/jira/browse/HIVE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5611: - Component/s: Build Infrastructure Add assembly (i.e.) tar creation to pom --- Key: HIVE-5611 URL: https://issues.apache.org/jira/browse/HIVE-5611 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Szehon Ho Labels: Maven -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5611) Add assembly (i.e.) tar creation to pom
[ https://issues.apache.org/jira/browse/HIVE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5611: - Labels: Maven (was: ) Add assembly (i.e.) tar creation to pom --- Key: HIVE-5611 URL: https://issues.apache.org/jira/browse/HIVE-5611 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Szehon Ho Labels: Maven -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5711) m2eclipse does not work and eclipse:eclipse requires a manual fix
[ https://issues.apache.org/jira/browse/HIVE-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5711: - Component/s: Build Infrastructure m2eclipse does not work and eclipse:eclipse requires a manual fix - Key: HIVE-5711 URL: https://issues.apache.org/jira/browse/HIVE-5711 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Priority: Critical Labels: Eclipse, Maven -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5711) m2eclipse does not work and eclipse:eclipse requires a manual fix
[ https://issues.apache.org/jira/browse/HIVE-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5711: - Labels: Eclipse Maven (was: ) m2eclipse does not work and eclipse:eclipse requires a manual fix - Key: HIVE-5711 URL: https://issues.apache.org/jira/browse/HIVE-5711 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Priority: Critical Labels: Eclipse, Maven -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5610) Merge maven branch into trunk
[ https://issues.apache.org/jira/browse/HIVE-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5610: - Component/s: Build Infrastructure Merge maven branch into trunk - Key: HIVE-5610 URL: https://issues.apache.org/jira/browse/HIVE-5610 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Brock Noland Labels: Maven Fix For: 0.13.0 Attachments: HIVE-5610.1-for-commit.patch, HIVE-5610.1-for-reading.patch, HIVE-5610.1-maven.patch, HIVE-5610.2-for-commit.patch, HIVE-5610.2-for-reading.patch, HIVE-5610.2-maven.patch, HIVE-5610.4-for-commit.patch, HIVE-5610.4-for-reading.patch, HIVE-5610.4-maven.patch, HIVE-5610.5-for-commit.patch, HIVE-5610.5-for-reading.patch, HIVE-5610.5-maven.patch, HIVE-5610.6-for-commit.patch, HIVE-5610.6-for-reading.patch, HIVE-5610.6-maven.patch, Screen Shot 2013-10-30 at 11.42.03 PM.png -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5610) Merge maven branch into trunk
[ https://issues.apache.org/jira/browse/HIVE-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809579#comment-13809579 ] Carl Steinbach commented on HIVE-5610: -- [~brocknoland] A couple more notes (please excuse the bullet points): * The most recent 'for-commit' patch doesn't work for me (common is unable to resolve the package o.a.h.hive.shims), but the 'maven' patch builds without problem. * 'mvn compile' fails because hive-service triggers a dependency on hive-exec-test. service/pom.xml correctly sets the scope to 'test' for this dependency, so I'm not sure why it's getting included during the compile stage. I tried debugging this by running 'mvn dependency:tree', but that fails with the same error. * maven-delete-ant.sh should be modified to remove */ivy.xml and eclipse-templates. Merge maven branch into trunk - Key: HIVE-5610 URL: https://issues.apache.org/jira/browse/HIVE-5610 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5610.1-for-commit.patch, HIVE-5610.1-for-reading.patch, HIVE-5610.1-maven.patch, HIVE-5610.2-for-commit.patch, HIVE-5610.2-for-reading.patch, HIVE-5610.2-maven.patch, HIVE-5610.4-for-commit.patch, HIVE-5610.4-for-reading.patch, HIVE-5610.4-maven.patch, HIVE-5610.5-for-commit.patch, HIVE-5610.5-for-reading.patch, HIVE-5610.5-maven.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5610) Merge maven branch into trunk
[ https://issues.apache.org/jira/browse/HIVE-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809855#comment-13809855 ] Carl Steinbach commented on HIVE-5610: -- [~thejas] Instead of using Maven from the command line to generate the Eclipse artifacts I think we're supposed to import the code as an existing maven project with M2Eclipse. However, I encountered two problems with the M2Eclipse route: # For every subproject (with the exception of hive-shims) Eclipse complains that Project x is missing required Java project 'hive-shims' # Eclipse complains that Plugin execution not covered by lifecycle configuration: x I think (1) is related to the fact that hive-shims is an aggregator project (as opposed to a Java project). Apparently M2Eclipse doesn't support this use case: https://issues.sonatype.org/browse/MNGECLIPSE-2291 http://rgladwell.wordpress.com/2010/12/06/4-reasons-why-maven-nested-modules-suck/ (2) can be fixed by modifying the POMs: http://wiki.eclipse.org/M2E_plugin_execution_not_covered Merge maven branch into trunk - Key: HIVE-5610 URL: https://issues.apache.org/jira/browse/HIVE-5610 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5610.1-for-commit.patch, HIVE-5610.1-for-reading.patch, HIVE-5610.1-maven.patch, HIVE-5610.2-for-commit.patch, HIVE-5610.2-for-reading.patch, HIVE-5610.2-maven.patch, HIVE-5610.4-for-commit.patch, HIVE-5610.4-for-reading.patch, HIVE-5610.4-maven.patch, HIVE-5610.5-for-commit.patch, HIVE-5610.5-for-reading.patch, HIVE-5610.5-maven.patch, HIVE-5610.6-for-commit.patch, HIVE-5610.6-for-reading.patch, HIVE-5610.6-maven.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5610) Merge maven branch into trunk
[ https://issues.apache.org/jira/browse/HIVE-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13808648#comment-13808648 ] Carl Steinbach commented on HIVE-5610: -- +1 Merge maven branch into trunk - Key: HIVE-5610 URL: https://issues.apache.org/jira/browse/HIVE-5610 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5610.1-for-commit.patch, HIVE-5610.1-for-reading.patch, HIVE-5610.1-maven.patch, HIVE-5610.2-for-commit.patch, HIVE-5610.2-for-reading.patch, HIVE-5610.2-maven.patch, HIVE-5610.4-for-commit.patch, HIVE-5610.4-for-reading.patch, HIVE-5610.4-maven.patch, HIVE-5610.5-for-commit.patch, HIVE-5610.5-for-reading.patch, HIVE-5610.5-maven.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5610) Merge maven branch into trunk
[ https://issues.apache.org/jira/browse/HIVE-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13807210#comment-13807210 ] Carl Steinbach commented on HIVE-5610: -- +1 to Ashutosh's suggestion. Also, are there junit or other test reports we can compare in order to verify that we aren't dropping any tests? Merge maven branch into trunk - Key: HIVE-5610 URL: https://issues.apache.org/jira/browse/HIVE-5610 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5610.1-for-commit.patch, HIVE-5610.1-for-reading.patch, HIVE-5610.1-maven.patch, HIVE-5610.2-for-commit.patch, HIVE-5610.2-for-reading.patch, HIVE-5610.2-maven.patch With HIVE-5566 nearing completion we will be nearly ready to merge the maven branch to trunk. The following tasks will be done post-merge: * HIVE-5611 - Add assembly (i.e.) tar creation to pom The merge process will be as follows: 1) Apply patch 2) Commit result 3) Modify the following line in maven-rollforward.sh: {noformat} mv $source $target {noformat} to {noformat} svn mv $source $target {noformat} 4) Execute maven-rollforward.sh and commit result 5) Modify the following line in maven-delete-ant.sh: {noformat} rm -rf $@ {noformat} to {noformat} svn rm $@ {noformat} 6) Execute maven-delete-ant.sh and commit result 7) Update trunk-mr1.properties and trunk-mr2.properties on the ptesting host, adding the following: {noformat} mavenEnvOpts = -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128 testCasePropertyName = test buildTool = maven unitTests.directories = ./ {noformat} h3. Notes: h4. On this jira I will upload three patches: {noformat} HIVE-5610.${VERSION}-for-reading.patch HIVE-5610.${VERSION}-for-commit.patch HIVE-5610.${VERSION}-maven.patch {noformat} * for-reading has no qfiles updates so it's easier to read * for-commit has the qfile updates and is for commit * maven is the patch in a rollforward state for testing purposes h4. To build everything you must: {noformat} $ mvn clean install -DskipTests $ cd itests $ mvn clean install -DskipTests {noformat} because itests (any tests that have cyclical dependencies or require that the packages be built) is not part of the root reactor build. -- This message was sent by Atlassian JIRA (v6.1#6144)
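The four settings quoted above for trunk-mr1.properties and trunk-mr2.properties are plain key/value pairs, and one of the values itself contains `=` characters. The following sketch shows how such lines split cleanly on the first `=` only. It assumes one property per line (the line breaks were lost in the archived message), and `parse_properties` is a hypothetical helper written for illustration, not the parser the ptest framework uses.

```python
def parse_properties(text: str) -> dict:
    """Parse 'key = value' lines, splitting only on the first '='."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Assumed layout: one property per line, as in a typical .properties file.
SNIPPET = """\
mavenEnvOpts = -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128
testCasePropertyName = test
buildTool = maven
unitTests.directories = ./
"""

props = parse_properties(SNIPPET)
```

Splitting on the first `=` keeps the whole `-Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128` string together as the value of `mavenEnvOpts`.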
[jira] [Commented] (HIVE-5610) Merge maven branch into trunk
[ https://issues.apache.org/jira/browse/HIVE-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13806236#comment-13806236 ] Carl Steinbach commented on HIVE-5610: -- Here are some issues I found: * When I remove the ~/.m2 directory, 'mvn compile' fails with an unsatisfied dependency error. * There are a bunch of JAR artifacts whose names don't start with hive- * It would be nice if this patch removed the old Ant and Ivy files, the eclipse-files directory, and anything else that it will make obsolete. How do I do the following: * Run the Thrift code generator. * Compile the Thrift C++ bindings in the ODBC directory. * Run a single TestCliDriver qfile test. Merge maven branch into trunk - Key: HIVE-5610 URL: https://issues.apache.org/jira/browse/HIVE-5610 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland With HIVE-5566 nearing completion we will be nearly ready to merge the maven branch to trunk. The following tasks will be done post-merge: * HIVE-5611 - Add assembly (i.e., tar) creation to pom * HIVE-5612 - Add ability to re-generate generated code stored in source control The merge process will be as follows: 1) svn merge ^/hive/branches/maven 2) Commit result 3) Modify the following line in maven-rollforward.sh: {noformat} mv $source $target {noformat} to {noformat} svn mv $source $target {noformat} 4) Execute maven-rollforward.sh 5) Commit result 6) Update trunk-mr1.properties and trunk-mr2.properties on the ptesting host, adding the following: {noformat} mavenEnvOpts = -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128 testCasePropertyName = test buildTool = maven unitTests.directories = ./ {noformat} Notes: * To build everything you must: {noformat} $ mvn clean install -DskipTests $ cd itests $ mvn clean install -DskipTests {noformat} because itests (any test that has cyclical dependencies or requires that the packages be built) is not part of the root reactor build.
[jira] [Commented] (HIVE-5229) Better thread management for HiveServer2 async threads
[ https://issues.apache.org/jira/browse/HIVE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13804995#comment-13804995 ] Carl Steinbach commented on HIVE-5229: -- [~vgumashta] Is the patch ready for review? Better thread management for HiveServer2 async threads -- Key: HIVE-5229 URL: https://issues.apache.org/jira/browse/HIVE-5229 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.12.0, 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-5229.1.patch [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. The async (background) thread pool currently creates N threads (server config), which are alive all the time. If all the threads in the pool are busy, a new request is added to a blocking queue. However, we can improve the strategy by not having all the async (background) threads alive when there are no corresponding requests. The async threads should die after a certain timeout if there are no new requests to handle. -- This message was sent by Atlassian JIRA (v6.1#6144)
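The idle-timeout behaviour described in HIVE-5229 can be illustrated with the standard java.util.concurrent API. This is only a minimal sketch and not the HiveServer2 patch: the class name AsyncPoolSketch, the method newBackgroundPool and its parameters are made up for illustration. It assumes the pool keeps at most N threads, queues requests when all of them are busy, and lets idle threads exit after a keep-alive period.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration; not the HiveServer2 implementation. */
final class AsyncPoolSketch {
    private AsyncPoolSketch() {}

    /**
     * Builds a pool with at most maxThreads workers and a bounded request queue.
     * keepAliveMillis must be greater than zero, otherwise
     * allowCoreThreadTimeOut(true) throws IllegalArgumentException.
     */
    static ThreadPoolExecutor newBackgroundPool(int maxThreads, int queueCapacity, long keepAliveMillis) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads,                 // core size: threads are created on demand up to this number
                maxThreads,                 // maximum size: same as core, so extra requests wait in the queue
                keepAliveMillis, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>(queueCapacity));
        // Without this call core threads stay alive forever once started.
        // With it, a thread that sees no new request for keepAliveMillis exits.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }
}
```

With this configuration no thread exists until the first request arrives, and the pool shrinks back to zero threads when the server is idle.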
[jira] [Commented] (HIVE-5268) HiveServer2 accumulates orphaned OperationHandle objects when a client fails while executing query
[ https://issues.apache.org/jira/browse/HIVE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802175#comment-13802175 ] Carl Steinbach commented on HIVE-5268: -- [~vaibhavgumashta] [~thiruvel] Thanks for tackling this problem. I took a quick look at the patch and have some thoughts I want to share. One of our design goals with HiveServer2 was to decouple session state from connection state, the motivation being that Hive queries can take a long time to complete, and you probably don't want your session/query to die if someone trips over your network cable. As a result every RPC contains either a logical session ID, or a logical operation ID. Because of this separation we can do interesting things like multiplex multiple sessions over the same physical connection or reference the same session over multiple physical connections. This property will also make it a lot easier to implement session migration between HiveServer2 instances. It looks like this patch creates a coupling between physical connection state and logical session state, and I think we should try to avoid doing this. I think we should try to view this issue as two separate problems: 1) making sure that Thrift resources (e.g. threads) are reclaimed when a client disconnects or times out due to inactivity, and 2) reclaiming resources associated with the session (excluding the connection state) when a session times out due to inactivity. Note that a connection timeout and session timeout are not linked, i.e. a connection timeout shouldn't trigger a session timeout and a session timeout shouldn't trigger a connection timeout. 
HiveServer2 accumulates orphaned OperationHandle objects when a client fails while executing query -- Key: HIVE-5268 URL: https://issues.apache.org/jira/browse/HIVE-5268 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Vaibhav Gumashta Assignee: Thiruvel Thirumoolan Fix For: 0.13.0 Attachments: HIVE-5268_prototype.patch When queries are executed against HiveServer2, an OperationHandle object is stored in the OperationManager.handleToOperation HashMap. Currently it's the duty of the JDBC client to explicitly close the statement in order to clean up the entry in the map. If the client fails to close the statement, the OperationHandle object is never cleaned up and accumulates in the server. This can potentially cause an OOM on the server over time. It can also be used as a loophole by a malicious client to bring down the Hive server.
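The second half of the split suggested in the comment above (reclaiming session or operation state after inactivity, independently of any physical connection) might look roughly like the following. This is a sketch under stated assumptions, not code from the attached patch: IdleHandleReaperSketch and all of its methods are hypothetical names, handles are reduced to plain string IDs, and the caller passes in the current time so that the logic never consults connection state.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: tracks logical handle IDs only, never connections. */
final class IdleHandleReaperSketch {
    // Logical handle ID -> time of the last RPC that referenced it.
    private final Map<String, Long> lastAccessMillis = new ConcurrentHashMap<>();
    private final long idleTimeoutMillis;

    IdleHandleReaperSketch(long idleTimeoutMillis) {
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    /** Registers a handle, or refreshes it when any RPC references it. */
    void touch(String handleId, long nowMillis) {
        lastAccessMillis.put(handleId, nowMillis);
    }

    /** Explicit close by a well-behaved client; returns false if already gone. */
    boolean close(String handleId) {
        return lastAccessMillis.remove(handleId) != null;
    }

    /** Removes handles idle for at least the timeout; returns how many were reaped. */
    int reapIdle(long nowMillis) {
        int reaped = 0;
        for (Map.Entry<String, Long> e : lastAccessMillis.entrySet()) {
            Long seen = e.getValue();
            // Conditional remove: a handle touched concurrently is left alone.
            if (nowMillis - seen >= idleTimeoutMillis && lastAccessMillis.remove(e.getKey(), seen)) {
                reaped++;
            }
        }
        return reaped;
    }

    int size() {
        return lastAccessMillis.size();
    }
}
```

A server would presumably call reapIdle from a periodic background task. Because the map is keyed by logical ID, a dropped connection by itself does not remove anything, which keeps connection timeouts and session timeouts unlinked.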
[jira] [Updated] (HIVE-4629) HS2 should support an API to retrieve query logs
[ https://issues.apache.org/jira/browse/HIVE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-4629: - Status: Open (was: Patch Available) HS2 should support an API to retrieve query logs Key: HIVE-4629 URL: https://issues.apache.org/jira/browse/HIVE-4629 Project: Hive Issue Type: Sub-task Components: HiveServer2 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4629.1.patch, HIVE-4629.2.patch, HIVE-4629-no_thrift.1.patch HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4629) HS2 should support an API to retrieve query logs
[ https://issues.apache.org/jira/browse/HIVE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13799672#comment-13799672 ] Carl Steinbach commented on HIVE-4629: -- [~brocknoland] Making the logs scrollable wasn't the point of my suggestion. I'm more concerned about leveraging the existing fetch functions and patterns to satisfy a use case which fundamentally looks very similar to fetching a query result set. CLIService is a public interface. We're permanently stuck with any changes that are made to it. I'd like to avoid cluttering it with a mishmash of logging RPCs if it's possible to reduce all of these use cases to a single pattern. In the future I hope we can finish discussing changes like this before the first patch is posted. [~shreepadma] I left some comments on reviewboard. Thanks. HS2 should support an API to retrieve query logs Key: HIVE-4629 URL: https://issues.apache.org/jira/browse/HIVE-4629 Project: Hive Issue Type: Sub-task Components: HiveServer2 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4629.1.patch, HIVE-4629.2.patch, HIVE-4629-no_thrift.1.patch HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client. -- This message was sent by Atlassian JIRA (v6.1#6144)
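To make the "single pattern" point concrete, here is one way a unified fetch call could be shaped. Everything in it is an assumption for illustration: OperationOutputSketch, its Source enum and fetchNext are hypothetical and are not part of CLIService or of any patch attached to this issue. The sketch only shows that query logs and query results can share one cursor-based fetch function instead of each getting its own RPC.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not the CLIService interface. */
final class OperationOutputSketch {
    /** Which stream of an operation the caller wants to read. */
    enum Source { QUERY_RESULT, QUERY_LOG }

    private final List<String> resultRows = new ArrayList<>();
    private final List<String> logLines = new ArrayList<>();
    private int resultCursor = 0;
    private int logCursor = 0;

    synchronized void addResultRow(String row) { resultRows.add(row); }

    synchronized void addLogLine(String line) { logLines.add(line); }

    /** Returns up to maxRows unread entries from the chosen source and advances its cursor. */
    synchronized List<String> fetchNext(Source source, int maxRows) {
        if (maxRows < 0) {
            throw new IllegalArgumentException("maxRows must be non-negative");
        }
        boolean isLog = (source == Source.QUERY_LOG);
        List<String> rows = isLog ? logLines : resultRows;
        int cursor = isLog ? logCursor : resultCursor;
        int end = Math.min(rows.size(), cursor + maxRows);
        List<String> batch = new ArrayList<>(rows.subList(cursor, end));
        if (isLog) {
            logCursor = end;
        } else {
            resultCursor = end;
        }
        return batch;
    }
}
```

A client polling for progress would call fetchNext(Source.QUERY_LOG, n) repeatedly while the operation runs, then switch to Source.QUERY_RESULT, so the public interface grows by one parameter rather than by a family of logging calls.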
[jira] [Commented] (HIVE-5500) Update my username on credits page
[ https://issues.apache.org/jira/browse/HIVE-5500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790641#comment-13790641 ] Carl Steinbach commented on HIVE-5500: -- +1 Update my username on credits page -- Key: HIVE-5500 URL: https://issues.apache.org/jira/browse/HIVE-5500 Project: Hive Issue Type: Task Reporter: Brock Noland Priority: Minor Attachments: HIVE-5500.patch My apache username is brock not brocknoland NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5489) NOTICE copyright dates are out of date, README needs update
[ https://issues.apache.org/jira/browse/HIVE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13789765#comment-13789765 ] Carl Steinbach commented on HIVE-5489: -- +1. Please commit this to trunk and then backport to branch-0.12. Thanks. NOTICE copyright dates are out of date, README needs update --- Key: HIVE-5489 URL: https://issues.apache.org/jira/browse/HIVE-5489 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Priority: Blocker Attachments: HIVE-5489.1.patch This needs to be updated for 0.12 release NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-2436) Update project naming and description in Hive website
[ https://issues.apache.org/jira/browse/HIVE-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13787892#comment-13787892 ] Carl Steinbach commented on HIVE-2436: -- +1 Update project naming and description in Hive website - Key: HIVE-2436 URL: https://issues.apache.org/jira/browse/HIVE-2436 Project: Hive Issue Type: Sub-task Reporter: John Sichi Assignee: Brock Noland Attachments: HIVE-2436.patch http://www.apache.org/foundation/marks/pmcs.html#naming -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5087) Rename npath UDF to matchpath
[ https://issues.apache.org/jira/browse/HIVE-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5087: - Attachment: HIVE-5087-matchpath.2.patch Rebasing the old patch in response to some test and PTF changes. Rename npath UDF to matchpath - Key: HIVE-5087 URL: https://issues.apache.org/jira/browse/HIVE-5087 Project: Hive Issue Type: Bug Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Blocker Fix For: 0.12.0 Attachments: HIVE-5087.1.patch.txt, HIVE-5087.99.patch.txt, HIVE-5087-matchpath.1.patch.txt, HIVE-5087-matchpath.2.patch, HIVE-5087.patch.txt, HIVE-5087.patch.txt, regex_path.diff -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5087) Rename npath UDF to matchpath
[ https://issues.apache.org/jira/browse/HIVE-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5087: - Status: Patch Available (was: Open) Rename npath UDF to matchpath - Key: HIVE-5087 URL: https://issues.apache.org/jira/browse/HIVE-5087 Project: Hive Issue Type: Bug Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Blocker Fix For: 0.12.0 Attachments: HIVE-5087.1.patch.txt, HIVE-5087.2.patch, HIVE-5087.99.patch.txt, HIVE-5087-matchpath.1.patch.txt, HIVE-5087-matchpath.2.patch, HIVE-5087.patch.txt, HIVE-5087.patch.txt, regex_path.diff -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5087) Rename npath UDF to matchpath
[ https://issues.apache.org/jira/browse/HIVE-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5087: - Attachment: HIVE-5087.2.patch Using the right patch name format this time... Rename npath UDF to matchpath - Key: HIVE-5087 URL: https://issues.apache.org/jira/browse/HIVE-5087 Project: Hive Issue Type: Bug Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Blocker Fix For: 0.12.0 Attachments: HIVE-5087.1.patch.txt, HIVE-5087.2.patch, HIVE-5087.99.patch.txt, HIVE-5087-matchpath.1.patch.txt, HIVE-5087-matchpath.2.patch, HIVE-5087.patch.txt, HIVE-5087.patch.txt, regex_path.diff -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5087) Rename npath UDF to matchpath
[ https://issues.apache.org/jira/browse/HIVE-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5087: - Status: Open (was: Patch Available) Rename npath UDF to matchpath - Key: HIVE-5087 URL: https://issues.apache.org/jira/browse/HIVE-5087 Project: Hive Issue Type: Bug Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Blocker Fix For: 0.12.0 Attachments: HIVE-5087.1.patch.txt, HIVE-5087.2.patch, HIVE-5087.99.patch.txt, HIVE-5087-matchpath.1.patch.txt, HIVE-5087-matchpath.2.patch, HIVE-5087.patch.txt, HIVE-5087.patch.txt, regex_path.diff -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5087) Rename npath UDF to matchpath
[ https://issues.apache.org/jira/browse/HIVE-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13785459#comment-13785459 ] Carl Steinbach commented on HIVE-5087: -- We are waiting until EOD Friday to commit this. If you are a Hive committer or PMC member and want more information about what's going on, then please send an email to the Hive PMC list. Thanks. Rename npath UDF to matchpath - Key: HIVE-5087 URL: https://issues.apache.org/jira/browse/HIVE-5087 Project: Hive Issue Type: Bug Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Blocker Fix For: 0.12.0 Attachments: HIVE-5087.1.patch.txt, HIVE-5087.99.patch.txt, HIVE-5087-matchpath.1.patch.txt, HIVE-5087.patch.txt, HIVE-5087.patch.txt, regex_path.diff -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5283) Merge vectorization branch to trunk
[ https://issues.apache.org/jira/browse/HIVE-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13781495#comment-13781495 ] Carl Steinbach commented on HIVE-5283: -- +1 the latest patch looks good to me. Merge vectorization branch to trunk --- Key: HIVE-5283 URL: https://issues.apache.org/jira/browse/HIVE-5283 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: alltypesorc, HIVE-5283.1.patch, HIVE-5283.2.patch, HIVE-5283.3.patch, HIVE-5283.4.patch The purpose of this jira is to upload vectorization patch, run tests etc. The actual work will continue under HIVE-4160 umbrella jira. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5087) Rename npath UDF to matchpath
[ https://issues.apache.org/jira/browse/HIVE-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13778099#comment-13778099 ] Carl Steinbach commented on HIVE-5087: -- [~thejas] I requested an update from our lawyer. Let's wait until EOD tomorrow to make a decision on this. Thanks. Rename npath UDF to matchpath - Key: HIVE-5087 URL: https://issues.apache.org/jira/browse/HIVE-5087 Project: Hive Issue Type: Bug Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Blocker Fix For: 0.12.0 Attachments: HIVE-5087.1.patch.txt, HIVE-5087.99.patch.txt, HIVE-5087-matchpath.1.patch.txt, HIVE-5087.patch.txt, HIVE-5087.patch.txt, regex_path.diff -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4957) Restrict number of bit vectors, to prevent out of Java heap memory
[ https://issues.apache.org/jira/browse/HIVE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-4957: - Status: Open (was: Patch Available) Comments on reviewboard. Thanks. Restrict number of bit vectors, to prevent out of Java heap memory -- Key: HIVE-4957 URL: https://issues.apache.org/jira/browse/HIVE-4957 Project: Hive Issue Type: Bug Affects Versions: 0.11.0 Reporter: Brock Noland Assignee: Shreepadma Venugopalan Attachments: HIVE-4957.1.patch Normally, increasing the number of bit vectors increases calculation accuracy. For example, {noformat} select compute_stats(a, 40) from test_hive; {noformat} generally gives better accuracy than {noformat} select compute_stats(a, 16) from test_hive; {noformat} But a larger number of bit vectors also makes the query run slower. Once the number of bit vectors exceeds 50, increasing it further no longer improves accuracy, but it still increases memory usage and crashes Hive if the number is too large. Hive currently doesn't prevent users from passing a ridiculously large number of bit vectors in a 'compute_stats' query. For example, {noformat} select compute_stats(a, 9) from column_eight_types; {noformat} crashes Hive. {noformat} 2012-12-20 23:21:52,247 Stage-1 map = 0%, reduce = 0% 2012-12-20 23:22:11,315 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.29 sec MapReduce Total cumulative CPU time: 290 msec Ended Job = job_1354923204155_0777 with errors Error during job, obtaining debugging information... Job Tracking URL: http://cs-10-20-81-171.cloud.cloudera.com:8088/proxy/application_1354923204155_0777/ Examining task ID: task_1354923204155_0777_m_00 (and more) from job job_1354923204155_0777 Task with the most failures(4): - Task ID: task_1354923204155_0777_m_00 URL: http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1354923204155_0777tipid=task_1354923204155_0777_m_00 - Diagnostic Messages for this Task: Error: Java heap space {noformat}
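The kind of guard requested in HIVE-4957 can be sketched as a simple argument check performed before any bit vectors are allocated. The names below are hypothetical and the limit of 1024 is an arbitrary assumption chosen for illustration; the issue text only states that accuracy stops improving beyond roughly 50 bit vectors, and the check actually adopted by Hive may differ.

```java
/** Hypothetical sketch of an upper bound on the compute_stats bit-vector argument. */
final class BitVectorLimitSketch {
    /** Assumed cap, for illustration only; not taken from the Hive source. */
    static final int ASSUMED_MAX_BIT_VECTORS = 1024;

    private BitVectorLimitSketch() {}

    /** Returns the requested count if it is within bounds, otherwise fails fast. */
    static int checkNumBitVectors(int requested) {
        if (requested < 1 || requested > ASSUMED_MAX_BIT_VECTORS) {
            throw new IllegalArgumentException(
                    "Number of bit vectors must be between 1 and "
                            + ASSUMED_MAX_BIT_VECTORS + ", got " + requested);
        }
        return requested;
    }
}
```

Rejecting the value up front turns a task-side "Java heap space" failure into an immediate, readable error for the user who issued the query.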
[jira] [Commented] (HIVE-5283) Merge vectorization branch to trunk
[ https://issues.apache.org/jira/browse/HIVE-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771592#comment-13771592 ] Carl Steinbach commented on HIVE-5283: -- [~anthony.murphy] Thanks for the very detailed explanation. Clearly a lot of thought has gone into writing these tests. Now we just need to find a way of conveying this information to people down the road. I recommend doing the following: concatenate all of these tests into a single qfile named vectorization_short_regress.q and include the information from above in a comment at the top of the file. It would also be great if you could include a short comment per query so folks have an easy way of telling them apart. Merge vectorization branch to trunk --- Key: HIVE-5283 URL: https://issues.apache.org/jira/browse/HIVE-5283 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-5283.1.patch, HIVE-5283.2.patch The purpose of this jira is to upload vectorization patch, run tests etc. The actual work will continue under HIVE-4160 umbrella jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5317) Implement insert, update, and delete in Hive with full ACID support
[ https://issues.apache.org/jira/browse/HIVE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771598#comment-13771598 ] Carl Steinbach commented on HIVE-5317: -- Will these features place any limitations on which storage formats you can use? Also, I don't think it's possible to support ACID guarantees and HCatalog (i.e. file permission based authorization) simultaneously on top of the same Hive warehouse. Is there a plan in place for fixing that? Implement insert, update, and delete in Hive with full ACID support --- Key: HIVE-5317 URL: https://issues.apache.org/jira/browse/HIVE-5317 Project: Hive Issue Type: New Feature Reporter: Owen O'Malley Assignee: Owen O'Malley Many customers want to be able to insert, update and delete rows from Hive tables with full ACID support. The use cases are varied, but the forms of queries that should be supported are: * INSERT INTO tbl SELECT … * INSERT INTO tbl VALUES ... * UPDATE tbl SET … WHERE … * DELETE FROM tbl WHERE … * MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN ... * SET TRANSACTION LEVEL … * BEGIN/END TRANSACTION Use Cases * Once an hour, a set of inserts and updates (up to 500k rows) for various dimension tables (e.g. customer, inventory, stores) needs to be processed. The dimension tables have primary keys and are typically bucketed and sorted on those keys. * Once a day, a small set (up to 100k rows) of records needs to be deleted for regulatory compliance. * Once an hour, a log of transactions is exported from an RDBMS and the fact tables need to be updated (up to 1m rows) to reflect the new data. The transactions are a combination of inserts, updates, and deletes. The table is partitioned and bucketed.
[jira] [Updated] (HIVE-5283) Merge vectorization branch to trunk
[ https://issues.apache.org/jira/browse/HIVE-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5283: - Status: Open (was: Patch Available) Merge vectorization branch to trunk --- Key: HIVE-5283 URL: https://issues.apache.org/jira/browse/HIVE-5283 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-5283.1.patch, HIVE-5283.2.patch The purpose of this jira is to upload vectorization patch, run tests etc. The actual work will continue under HIVE-4160 umbrella jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5283) Merge vectorization branch to trunk
[ https://issues.apache.org/jira/browse/HIVE-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13767957#comment-13767957 ] Carl Steinbach commented on HIVE-5283: -- I added some more comments on RB. I wanted to note this here since the site seems overwhelmed by the size of this patch and I have my doubts that the comments are actually going to get reposted here. Merge vectorization branch to trunk --- Key: HIVE-5283 URL: https://issues.apache.org/jira/browse/HIVE-5283 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-5283.1.patch, HIVE-5283.2.patch The purpose of this jira is to upload the vectorization patch, run tests, etc. The actual work will continue under the HIVE-4160 umbrella jira.
[jira] [Commented] (HIVE-3585) Integrate Trevni as another columnar oriented file format
[ https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13767983#comment-13767983 ] Carl Steinbach commented on HIVE-3585: -- I agree with what Ed said earlier and want to add that as a project we shouldn't put ourselves in the position of picking winners and losers when it comes to battles between competing data serialization formats. As long as a patch like this meets the same code quality standards that we apply to every other patch I think it should get committed. Integrate Trevni as another columnar oriented file format - Key: HIVE-3585 URL: https://issues.apache.org/jira/browse/HIVE-3585 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.10.0 Reporter: alex gemini Assignee: Mark Wagner Priority: Minor Attachments: futurama_episodes.avro, HIVE-3585.1.patch.txt Add the new Avro module Trevni as another columnar format. A new columnar format needs a columnar SerDe, and fastutil seems like a good choice. The Shark project uses the fastutil library as its columnar SerDe library, but it seems too large (almost 15 MB) for just a few primitive array collections.
[jira] [Commented] (HIVE-5087) Rename npath UDF to matchpath
[ https://issues.apache.org/jira/browse/HIVE-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13767992#comment-13767992 ] Carl Steinbach commented on HIVE-5087: -- No one has threatened legal action over any of the other UDF names. If that happens I suppose we'll do the same thing. Rename npath UDF to matchpath - Key: HIVE-5087 URL: https://issues.apache.org/jira/browse/HIVE-5087 Project: Hive Issue Type: Bug Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Blocker Fix For: 0.12.0 Attachments: HIVE-5087.1.patch.txt, HIVE-5087.99.patch.txt, HIVE-5087-matchpath.1.patch.txt, HIVE-5087.patch.txt, HIVE-5087.patch.txt, regex_path.diff -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5253) Create component to compile and jar dynamic code
[ https://issues.apache.org/jira/browse/HIVE-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13768054#comment-13768054 ] Carl Steinbach commented on HIVE-5253: -- Can you post a review request? Thanks. Create component to compile and jar dynamic code Key: HIVE-5253 URL: https://issues.apache.org/jira/browse/HIVE-5253 Project: Hive Issue Type: Sub-task Reporter: Edward Capriolo Assignee: Edward Capriolo Attachments: HIVE-5253.patch.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5283) Merge vectorization branch to trunk
[ https://issues.apache.org/jira/browse/HIVE-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13766303#comment-13766303 ] Carl Steinbach commented on HIVE-5283: -- [~jnp] Can you please post a review request on phabricator or reviewboard? Also, is the plan to commit this patch or to merge the vectorization branch into trunk? Merge vectorization branch to trunk --- Key: HIVE-5283 URL: https://issues.apache.org/jira/browse/HIVE-5283 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-5283.1.patch, HIVE-5283.2.patch The purpose of this jira is to upload vectorization patch, run tests etc. The actual work will continue under HIVE-4160 umbrella jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4763) add support for thrift over http transport in HS2
[ https://issues.apache.org/jira/browse/HIVE-4763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763984#comment-13763984 ] Carl Steinbach commented on HIVE-4763: -- Cool. One additional note: the changes to SessionState look very suspect. I think the lazyInitForHttp variable is going to prevent us from running both BINARY and HTTP modes at the same time, and I don't understand why the modifications to SessionState.get() were necessary. add support for thrift over http transport in HS2 - Key: HIVE-4763 URL: https://issues.apache.org/jira/browse/HIVE-4763 Project: Hive Issue Type: Sub-task Components: HiveServer2 Reporter: Thejas M Nair Assignee: Vaibhav Gumashta Fix For: 0.12.0 Attachments: HIVE-4763.1.patch, HIVE-4763.2.patch, HIVE-4763.D12855.1.patch Subtask for adding support for http transport mode for thrift api in hive server2. Support for the different authentication modes will be part of another subtask. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5107) Change hive's build to maven
[ https://issues.apache.org/jira/browse/HIVE-5107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13761213#comment-13761213 ] Carl Steinbach commented on HIVE-5107: -- [~brocknoland] Do you have any theories about why your mavenized build completes in 1/5 the amount of time required by the current build? Change hive's build to maven Key: HIVE-5107 URL: https://issues.apache.org/jira/browse/HIVE-5107 Project: Hive Issue Type: Task Reporter: Edward Capriolo Assignee: Edward Capriolo Attachments: HIVE-5107-wip.patch I cannot cope with Hive's build infrastructure any more. I have started working on porting the project to Maven. When I have made some solid progress I will put the entire thing on GitHub for review. Then we can talk about switching the project somehow.
[jira] [Commented] (HIVE-5087) Rename npath UDF
[ https://issues.apache.org/jira/browse/HIVE-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13761572#comment-13761572 ] Carl Steinbach commented on HIVE-5087: -- [~appodictic] I think the thread that you're referring to was on the private list, so Harish didn't see it. It's my fault that this information wasn't communicated to him earlier, and that I solicited his opinion at such a late stage in the game. Still, I think we should defer to him in this matter since he was the primary implementor of this feature. I have attached a new patch that changes the name of the function to matchpath and updates the various testcases. If the tests pass and someone else +1s it I will take care of getting it committed to trunk and backported to branch-0.12. Rename npath UDF Key: HIVE-5087 URL: https://issues.apache.org/jira/browse/HIVE-5087 Project: Hive Issue Type: Bug Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Blocker Fix For: 0.12.0 Attachments: HIVE-5087.1.patch.txt, HIVE-5087.99.patch.txt, HIVE-5087-matchpath.1.patch.txt, HIVE-5087.patch.txt, HIVE-5087.patch.txt, regex_path.diff -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5087) Rename npath UDF
[ https://issues.apache.org/jira/browse/HIVE-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-5087: - Attachment: HIVE-5087-matchpath.1.patch.txt Rename npath UDF Key: HIVE-5087 URL: https://issues.apache.org/jira/browse/HIVE-5087 Project: Hive Issue Type: Bug Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Blocker Fix For: 0.12.0 Attachments: HIVE-5087.1.patch.txt, HIVE-5087.99.patch.txt, HIVE-5087-matchpath.1.patch.txt, HIVE-5087.patch.txt, HIVE-5087.patch.txt, regex_path.diff -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira