[jira] [Commented] (HIVE-6784) parquet-hive should allow column type change
[ https://issues.apache.org/jira/browse/HIVE-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13966341#comment-13966341 ] Justin Coffey commented on HIVE-6784: - You've cited a lazy SerDe. Parquet is not lazy; it is similar to ORC. Have a look at ORC's deserialize() method (org.apache.hadoop.hive.ql.io.orc.OrcSerde): {code} @Override public Object deserialize(Writable writable) throws SerDeException { return writable; } {code} A quick look through the ORC code indicates to me that it does not do any reparsing (though I might have missed something). Looking through other SerDes, not a single one (that I checked) reparses values. Value parsing is handled in ObjectInspectors (poke around org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils). In my opinion, the *substantial* performance penalty that you are introducing with this patch will be a much bigger obstacle to adopting Parquet than obliging people to rebuild their data set in the rare event that a type has to change. And if you do need to change a type, INSERT OVERWRITE TABLE is a good workaround. -1 parquet-hive should allow column type change Key: HIVE-6784 URL: https://issues.apache.org/jira/browse/HIVE-6784 Project: Hive Issue Type: Bug Components: File Formats, Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Tongjie Chen Fix For: 0.14.0 Attachments: HIVE-6784.1.patch.txt, HIVE-6784.2.patch.txt See also the following Parquet issue: https://github.com/Parquet/parquet-mr/issues/323 Currently, if we change a Parquet-format Hive table using alter table parquet_table change c1 c1 bigint (assuming the original type of c1 is int), queries fail at runtime with an exception thrown from the SerDe: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.LongWritable. This is different from Hive's behavior with other file formats, where it tries to perform a cast (producing null values for incompatible types). Parquet-Hive's RecordReader returns an ArrayWritable (based on the schema stored in the footers of the Parquet files); ParquetHiveSerDe also creates a corresponding ArrayWritableObjectInspector (but using column type info from the metastore). Whenever there is a column type change, the object inspector throws an exception, since WritableLongObjectInspector cannot inspect an IntWritable, etc. Conversion has to happen somewhere if we want to allow type changes, and the SerDe's deserialize method seems a natural place for it. Currently, the serialize method calls createStruct (then createPrimitive) for every record, but it creates a new object regardless, which seems expensive; that could be optimized a bit by returning the object passed in if it is already of the right type. deserialize also reuses this method, so if there is a type change a new object has to be created, which I think is inevitable. -- This message was sent by Atlassian JIRA (v6.2#6252)
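To make the optimization suggested in the description concrete, here is a minimal, hypothetical sketch using plain Hadoop Writables (not the actual ParquetHiveSerDe code): hand the incoming object back when it already matches the declared type, and only allocate a new object when a widening conversion such as int to bigint is needed.
{code}
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Writable;

public class WritableWidening {
  // Return the value unchanged when it already matches the declared type, and only
  // allocate a new Writable when a widening conversion (int -> bigint) is needed.
  public static Writable toLong(Writable value) {
    if (value == null || value instanceof LongWritable) {
      return value;                        // already the right type: no copy, no reparse
    }
    if (value instanceof IntWritable) {    // column changed via ALTER TABLE ... CHANGE
      return new LongWritable(((IntWritable) value).get());
    }
    return null;                           // incompatible type: NULL, like other formats
  }

  public static void main(String[] args) {
    System.out.println(toLong(new IntWritable(42)));    // converted copy: 42
    System.out.println(toLong(new LongWritable(42L)));  // same object handed back: 42
  }
}
{code}
In the common case of no type change this costs only an instanceof check per value, which is the trade-off the comment above is arguing about.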
[jira] [Commented] (HIVE-6131) New columns after table alter result in null values despite data
[ https://issues.apache.org/jira/browse/HIVE-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13966352#comment-13966352 ] Szehon Ho commented on HIVE-6131: - Yes, that's what I meant: 'alter table partition (spec) add column', etc. I wonder why it's not natural? In my opinion it would give users more flexibility. My concern with your approach is that it locks users into only one choice per SerDe. For example, take RCFile's SerDes: which would you pick? If it is 'table-schema', then you hit the partition..11.q issue (error after a column-type change). If it is 'partition-schema', then you hit this JIRA's issue (after adding a column, new data loaded into that column reads as null) because the partition schema is never updated. There might be users interested in both cases (we ourselves are interested in the latter use case for RCFile). New columns after table alter result in null values despite data Key: HIVE-6131 URL: https://issues.apache.org/jira/browse/HIVE-6131 Project: Hive Issue Type: Bug Affects Versions: 0.11.0, 0.12.0, 0.13.0 Reporter: James Vaughan Priority: Minor Attachments: HIVE-6131.1.patch Hi folks, I found and verified a bug on our CDH 4.0.3 install of Hive when adding columns to tables with partitions using 'REPLACE COLUMNS'. I dug through the JIRA a little bit and didn't see anything for it, so hopefully this isn't just noise on the radar. Basically, when you alter a table with partitions and then re-upload data to a partition, Hive doesn't seem to recognize the extra data that actually exists in HDFS: it returns NULL values for the new column despite the data being present and the new column appearing in the metadata. Here are some steps to reproduce using a basic table: 1. Run this Hive command: CREATE TABLE jvaughan_test (col1 string) partitioned by (day string); 2. Create a simple file on the system with a couple of entries, something like hi and hi2 separated by newlines. 3. Run this Hive command, pointing it at the file: LOAD DATA LOCAL INPATH 'FILEDIR' OVERWRITE INTO TABLE jvaughan_test PARTITION (day = '2014-01-02'); 4. Confirm the data with: SELECT * FROM jvaughan_test WHERE day = '2014-01-02'; 5. Alter the column definitions: ALTER TABLE jvaughan_test REPLACE COLUMNS (col1 string, col2 string); 6. Edit your file and add a second column using the default separator (ctrl+v, then ctrl+a in Vim) with two more entries, such as hi3 on the first row and hi4 on the second. 7. Run step 3 again. 8. Check the data again as in step 4. For me, these are the results that get returned: hive> select * from jvaughan_test where day = '2014-01-02'; OK hi NULL 2014-01-02 hi2 NULL 2014-01-02 This is despite the fact that the data is in the file stored by the partition in HDFS. Let me know if you need any other information. The only workaround for me currently is to drop the partitions I'm replacing data in and THEN re-upload the new data file. Thanks, -James -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6873) DISTINCT clause in aggregates is handled incorrectly by vectorized execution
[ https://issues.apache.org/jira/browse/HIVE-6873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13966357#comment-13966357 ] Remus Rusanu commented on HIVE-6873: Thanks for finding the problem, the repro, and the fix! DISTINCT clause in aggregates is handled incorrectly by vectorized execution Key: HIVE-6873 URL: https://issues.apache.org/jira/browse/HIVE-6873 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0, 0.14.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-6873.1.patch, HIVE-6873.2.patch, HIVE-6873.3.patch The vectorized aggregates ignore the DISTINCT clause, which causes incorrect results. Because GroupByOperatorDesc adds the DISTINCT keys to the overall aggregate keys, the vectorized aggregates do account for the extra key, but they do not process the data correctly for it. The reduce side then aggregates the input from the vectorized map side into results that are only sometimes correct, but mostly incorrect. HIVE-4607 tracks the proper fix; in the meantime I'm filing a bug to disable vectorized execution when DISTINCT is present. The fix is trivial. -- This message was sent by Atlassian JIRA (v6.2#6252)
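A minimal sketch of the kind of guard the description proposes; the class and method names below are illustrative stand-ins, not the actual Hive Vectorizer code.
{code}
import java.util.Arrays;
import java.util.List;

public class VectorizationGuard {
  // Stand-in for Hive's AggregationDesc: an aggregate function plus its DISTINCT flag.
  static class AggregateCall {
    final String function;
    final boolean distinct;
    AggregateCall(String function, boolean distinct) {
      this.function = function;
      this.distinct = distinct;
    }
  }

  // Vectorized GROUP BY is skipped when any aggregate uses DISTINCT,
  // falling back to row-mode execution until HIVE-4607 provides the proper fix.
  static boolean canVectorize(List<AggregateCall> aggregates) {
    for (AggregateCall call : aggregates) {
      if (call.distinct) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(canVectorize(Arrays.asList(new AggregateCall("sum", true))));    // false
    System.out.println(canVectorize(Arrays.asList(new AggregateCall("count", false)))); // true
  }
}
{code}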
[jira] [Commented] (HIVE-6785) query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe
[ https://issues.apache.org/jira/browse/HIVE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13966355#comment-13966355 ] Szehon Ho commented on HIVE-6785: - Hi Tongjie, those are deprecated now and will be removed. See the discussion on HIVE-6757 for the current state. Use 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe', 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat', and 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'. query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe -- Key: HIVE-6785 URL: https://issues.apache.org/jira/browse/HIVE-6785 Project: Hive Issue Type: Bug Components: File Formats, Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Tongjie Chen Fix For: 0.14.0 Attachments: HIVE-6785.1.patch.txt, HIVE-6785.2.patch.txt When a Hive table's SerDe is ParquetHiveSerDe while some partitions use a different SerDe, AND the table has string column[s], Hive generates a confusing error message: Failed with exception java.io.IOException:java.lang.ClassCastException: parquet.hive.serde.primitive.ParquetStringInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.primitive.SettableTimestampObjectInspector This is confusing because timestamp is mentioned even though it is not used by the table. The reason is that when the table and partition SerDes differ, Hive tries to convert between the object inspectors of the two SerDes. ParquetHiveSerDe's object inspector for the string type is ParquetStringInspector (newly introduced), which is neither a subclass of WritableStringObjectInspector nor of JavaStringObjectInspector, the two that ObjectInspectorConverters expects for a string-category object inspector. There is no break statement in the STRING case, so the following TIMESTAMP case is executed, generating the confusing error message. See also the following Parquet issue: https://github.com/Parquet/parquet-mr/issues/324 The fix is relatively easy: make ParquetStringInspector a subclass of JavaStringObjectInspector instead of AbstractPrimitiveJavaObjectInspector. But because the constructor of JavaStringObjectInspector is package scoped instead of public or protected, we would need to move ParquetStringInspector into the same package as JavaStringObjectInspector. Also, ArrayWritableObjectInspector's setStructFieldData needs to accept List data as well, since the corresponding setStructFieldData and create methods return a list. This is also needed when the table SerDe is ParquetHiveSerDe and the partition SerDe is something else. -- This message was sent by Atlassian JIRA (v6.2#6252)
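The missing-break behavior described above is ordinary Java switch fall-through; this small self-contained demo (not the Hive source) shows how a STRING input can end up reported against the TIMESTAMP branch.
{code}
public class FallThroughDemo {
  enum Category { STRING, TIMESTAMP }

  static String convert(Category category) {
    String result = "unconverted";
    switch (category) {
      case STRING:
        result = "string converter chosen";
        break;   // without this break, STRING falls through into the TIMESTAMP case,
      case TIMESTAMP:  // which is how the misleading timestamp error can surface
        result = "timestamp converter chosen";
        break;
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(convert(Category.STRING));     // string converter chosen
    System.out.println(convert(Category.TIMESTAMP));  // timestamp converter chosen
  }
}
{code}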
[jira] [Updated] (HIVE-6873) DISTINCT clause in aggregates is handled incorrectly by vectorized execution
[ https://issues.apache.org/jira/browse/HIVE-6873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-6873: --- Attachment: image004.png image003.png image002.png image001.png set hive.map.groupby.sorted=true; was the missing part for me... From: Jitendra Nath Pandey (JIRA) [mailto:j...@apache.org] Sent: Friday, April 11, 2014 1:02 AM To: Remus Rusanu Subject: [jira] [Commented] (HIVE-6873) DISTINCT clause in aggregates is handled incorrectly by vectorized execution Jitendra Nath Pandey commented on HIVE-6873 (https://issues.apache.org/jira/browse/HIVE-6873): Here is a scenario where we get an incorrect result. It shows up on a sorted, bucketed column with hive.map.groupby.sorted=true, and only on group-by queries with no keys. Here are the steps:
hive> create table T(a int, b int) clustered by (a) sorted by (a) stored as orc;
load the following data:
300 1
300 1
300 1
300 1
300 1
hive> set hive.map.groupby.sorted=true;
hive> select sum(distinct a) from T; // Incorrect result.
hive> select count(distinct a) from T; // This is also incorrect.
This message was sent by Atlassian JIRA (v6.2#6252-sha1:aa34325) DISTINCT clause in aggregates is handled incorrectly by vectorized execution Key: HIVE-6873 URL: https://issues.apache.org/jira/browse/HIVE-6873 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0, 0.14.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-6873.1.patch, HIVE-6873.2.patch, HIVE-6873.3.patch, image001.png, image002.png, image003.png, image004.png The vectorized aggregates ignore the DISTINCT clause, which causes incorrect results. Because GroupByOperatorDesc adds the DISTINCT keys to the overall aggregate keys, the vectorized aggregates do account for the extra key, but they do not process the data correctly for it. The reduce side then aggregates the input from the vectorized map side into results that are only sometimes correct, but mostly incorrect. HIVE-4607 tracks the proper fix; in the meantime I'm filing a bug to disable vectorized execution when DISTINCT is present. The fix is trivial. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6883) Dynamic partitioning optimization does not honor sort order or order by
[ https://issues.apache.org/jira/browse/HIVE-6883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13966364#comment-13966364 ] Hive QA commented on HIVE-6883: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12639717/HIVE-6883.3.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5613 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16 org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_mapreduce_stack_trace_hadoop20 org.apache.hive.jdbc.TestJdbcDriver2.testNewConnectionConfiguration {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2218/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2218/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12639717 Dynamic partitioning optimization does not honor sort order or order by --- Key: HIVE-6883 URL: https://issues.apache.org/jira/browse/HIVE-6883 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Priority: Critical Fix For: 0.13.0 Attachments: HIVE-6883.1.patch, HIVE-6883.2.patch, HIVE-6883.3.patch HIVE-6455 patch does not honor sort order of the output table or order by of select statement. The reason for the former is numDistributionKey in ReduceSinkDesc is set wrongly. It doesn't take into account the sort columns, because of this RSOp sets the sort columns to null in Key. Since nulls are set in place of sort columns in Key, the sort columns in Value are not sorted. The other issue is ORDER BY columns are not honored during insertion. For example {code} insert overwrite table over1k_part_orc partition(ds=foo, t) select si,i,b,f,t from over1k_orc where t is null or t=27 order by si; {code} the select query performs order by on column 'si' in the first MR job. The following MR job (inserted by HIVE-6455), sorts the input data on dynamic partition column 't' without taking into account the already sorted 'si' column. This results in out of order insertion for 'si' column. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6883) Dynamic partitioning optimization does not honor sort order or order by
[ https://issues.apache.org/jira/browse/HIVE-6883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13966368#comment-13966368 ] Prasanth J commented on HIVE-6883: -- The tests are not related. Dynamic partitioning optimization does not honor sort order or order by --- Key: HIVE-6883 URL: https://issues.apache.org/jira/browse/HIVE-6883 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Priority: Critical Fix For: 0.13.0 Attachments: HIVE-6883.1.patch, HIVE-6883.2.patch, HIVE-6883.3.patch HIVE-6455 patch does not honor sort order of the output table or order by of select statement. The reason for the former is numDistributionKey in ReduceSinkDesc is set wrongly. It doesn't take into account the sort columns, because of this RSOp sets the sort columns to null in Key. Since nulls are set in place of sort columns in Key, the sort columns in Value are not sorted. The other issue is ORDER BY columns are not honored during insertion. For example {code} insert overwrite table over1k_part_orc partition(ds=foo, t) select si,i,b,f,t from over1k_orc where t is null or t=27 order by si; {code} the select query performs order by on column 'si' in the first MR job. The following MR job (inserted by HIVE-6455), sorts the input data on dynamic partition column 't' without taking into account the already sorted 'si' column. This results in out of order insertion for 'si' column. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6868) Create table in HCatalog sets different SerDe defaults than what is set through the CLI
[ https://issues.apache.org/jira/browse/HIVE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13966428#comment-13966428 ] Hive QA commented on HIVE-6868: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12639720/HIVE-6868.3.patch {color:green}SUCCESS:{color} +1 5585 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2219/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2219/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12639720 Create table in HCatalog sets different SerDe defaults than what is set through the CLI --- Key: HIVE-6868 URL: https://issues.apache.org/jira/browse/HIVE-6868 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Harish Butani Attachments: HIVE-6868.1.patch, HIVE-6868.2.patch, HIVE-6868.3.patch HCatCreateTableDesc doesn't invoke the getEmptyTable function on org.apache.hadoop.hive.ql.metadata.Table -- This message was sent by Atlassian JIRA (v6.2#6252)
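For illustration, a hedged sketch of the gap the description names: seed the HCatalog table descriptor from Hive's own empty-table template so both code paths start from the same SerDe defaults. The static getEmptyTable(db, table) signature is assumed from the class the description references.
{code}
import org.apache.hadoop.hive.ql.metadata.Table;

public class HCatTableDefaults {
  // getEmptyTable() is the helper the CLI's CREATE TABLE path uses to fill in the
  // default SerDe, input/output formats and table parameters.
  public static org.apache.hadoop.hive.metastore.api.Table newTableTemplate(String db, String name) {
    return Table.getEmptyTable(db, name);
  }
}
{code}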
[jira] [Resolved] (HIVE-6887) Add missing params to hive-default.xml.template
[ https://issues.apache.org/jira/browse/HIVE-6887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani resolved HIVE-6887. - Resolution: Fixed Committed to trunk and 0.13 Lefty thanks for your help with this issue. Add missing params to hive-default.xml.template Key: HIVE-6887 URL: https://issues.apache.org/jira/browse/HIVE-6887 Project: Hive Issue Type: Bug Reporter: Harish Butani Attachments: HIVE-6887.1.patch Add in the ones that were added to HiveConf, but not the template.xml file; For 0.13 we will not be moving to HIVE-6037 style of genning the template file. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6873) DISTINCT clause in aggregates is handled incorrectly by vectorized execution
[ https://issues.apache.org/jira/browse/HIVE-6873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-6873: Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk and 0.13 Thanks Jitendra, Remus and Ashutosh DISTINCT clause in aggregates is handled incorrectly by vectorized execution Key: HIVE-6873 URL: https://issues.apache.org/jira/browse/HIVE-6873 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0, 0.14.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-6873.1.patch, HIVE-6873.2.patch, HIVE-6873.3.patch, image001.png, image002.png, image003.png, image004.png The vectorized aggregates ignore the DISTINCT clause. This cause incorrect results. Due to how GroupByOperatorDesc adds the DISTINCT keys to the overall aggregate keys the vectorized aggregates do account for the extra key, but they do not process the data correctly for the key. the reduce side the aggregates the input from the vectorized map side to results that are only sometimes correct but mostly incorrect. HIVE-4607 tracks the proper fix, but meantime I'm filing a bug to disable vectorized execution if DISTINCT is present. Fix is trivial. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6732) Update Release Notes for Hive 0.13
[ https://issues.apache.org/jira/browse/HIVE-6732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-6732: Attachment: HIVE-6732.3.patch Update Release Notes for Hive 0.13 -- Key: HIVE-6732 URL: https://issues.apache.org/jira/browse/HIVE-6732 Project: Hive Issue Type: Bug Affects Versions: 0.13.0, 0.14.0 Reporter: Harish Butani Assignee: Harish Butani Priority: Trivial Fix For: 0.13.0 Attachments: HIVE-6732.1.patch, HIVE-6732.2.patch, HIVE-6732.3.patch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6895) Beeline -f Requires A New Line
Jesse Anderson created HIVE-6895: Summary: Beeline -f Requires A New Line Key: HIVE-6895 URL: https://issues.apache.org/jira/browse/HIVE-6895 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.10.0 Reporter: Jesse Anderson When a file is run through Beeline, the file must contain a newline at the end of the file or the command will not run. The Beeline shell seems to disconnect without running the command. Here is the command without a newline: {noformat} [training@dev solution]$ beeline -u jdbc:hive2://dev.loudacre.com -n training -p training -f create-devicestatus-table.hql scan complete in 7ms Connecting to jdbc:hive2://dev.loudacre.com Connected to: Hive (version 0.10.0) Driver: Hive (version 0.10.0-cdh4.5.0) Transaction isolation: TRANSACTION_REPEATABLE_READ Beeline version 0.10.0-cdh4.5.0 by Apache Hive 0: jdbc:hive2://dev.loudacre.com DROP TABLE IF EXISTS devicestatus;Closing: org.apache.hive.jdbc.HiveConnection {noformat} Here is the same command with a newline: {noformat} [training@dev solution]$ beeline -u jdbc:hive2://dev.loudacre.com -n training -p training -f create-devicestatus-table.hql scan complete in 8ms Connecting to jdbc:hive2://dev.loudacre.com Connected to: Hive (version 0.10.0) Driver: Hive (version 0.10.0-cdh4.5.0) Transaction isolation: TRANSACTION_REPEATABLE_READ Beeline version 0.10.0-cdh4.5.0 by Apache Hive 0: jdbc:hive2://dev.loudacre.com DROP TABLE IF EXISTS devicestatus; No rows affected (0.222 seconds) 0: jdbc:hive2://dev.loudacre.com Closing: org.apache.hive.jdbc.HiveConnection {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
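Until the Beeline behavior is fixed, a small client-side workaround is to make sure the script ends with a newline before invoking beeline -f. A hedged sketch (the file name and approach are illustrative, not part of Beeline):
{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class EnsureTrailingNewline {
  public static void main(String[] args) throws IOException {
    Path script = Paths.get(args[0]);   // e.g. create-devicestatus-table.hql
    byte[] bytes = Files.readAllBytes(script);
    if (bytes.length == 0 || bytes[bytes.length - 1] != '\n') {
      // Append the missing newline so the final statement is actually executed.
      Files.write(script, new byte[] {'\n'}, StandardOpenOption.APPEND);
    }
  }
}
{code}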
[jira] [Commented] (HIVE-5072) [WebHCat]Enable directly invoke Sqoop job through Templeton
[ https://issues.apache.org/jira/browse/HIVE-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13966752#comment-13966752 ] Shuaishuai Nie commented on HIVE-5072: -- Thanks [~ekoifman] for the comments. Please see below for the answers:
0. If I understand this correctly, optionsFile should contain the details of the Sqoop command to execute. But in the code it seems that the expectation is that this file is present in DFS. Thus, to submit a Sqoop job via WebHCat (and use optionsFile) the user has to first upload this file to the cluster. This is an extra call for job submission, and possibly extra config on the cluster side to enable the WebHCat client to upload files. Why not just let the client upload the file to WebHCat as part of the REST POST request? This seems a lot more user friendly/usable.
The user scenario for the options file is that a user may want to reuse part of the Sqoop command arguments across different commands, such as the connection string, username or password. In this case, the user should expect the file to already exist on DFS for them to use across different jobs. Since Sqoop only supports options files on the local file system, and Templeton may launch the Sqoop job on any worker node, Templeton needs to add the options file to the distributed cache so that it can be used in the Sqoop command. You mentioned "Why not just let the client upload the file to WebHCat as part of the REST POST request"; where would the file be located originally? If it comes from the local file system, that would require an extra copy and an extra command for each Templeton Sqoop job.
1. -d 'user.name=foo' is deprecated (i.e. user.name as a form parameter). user.name has to be part of the query string. The test cases and examples in the .pdf should be updated.
2. Formatting in SqoopDelegator doesn't follow Hive conventions.
3. Server.sqoop() - there is Server.checkEnableLogPrerequisite() to check the 'enableLog' parameter setting.
4. I see that new parameters for the Sqoop tests are added in 3 places in build.xml. Only the 'test' target actually runs jobsubmission.conf.
I will change the patch and documentation accordingly for 1-4.
5. For the tests you added, where does the JDBC driver come from for any particular DB?
The JDBC driver should come from the Sqoop installation, based on which database is used. It should be located in the %SQOOP_HOME%\lib folder.
6. Can the form parameter for optionsFile (Server.sqoop()) be called optionsFile instead of just file?
The file argument does not work exactly the same as --options-file in Sqoop, since --options-file can be only part of the command, while file here can only be the entire command. But I think changing the name to optionsFile may be more self-explanatory for users.
7. It seems from http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html#_using_options_files_to_pass_arguments that in a Sqoop command, either an options file (with the command and args) or the command name and all args inline can be specified. The tests you added seem to expect only command args to be in the options file. In particular, Server.sqoop() tests command == null and optionsFile == null, but not whether both options are specified. Seems like this is not expected usage.
As I mentioned earlier, the optionsFile here in Server.sqoop() does not work exactly the same as --options-file in Sqoop. The use of --options-file from Sqoop is tested in the second e2e test for Sqoop; in that test, the --options-file substitutes for part of the Sqoop command. The Templeton Sqoop option should not allow both command and optionsFile to be defined, since the optionsFile here is supposed to be used as the entire Sqoop command. I will add the condition check for this scenario.
8. Is there anything that can be done to make the test self-contained, so that the DB table is automatically created, for example in the DB that contains the metastore data?
There is no efficient way to make the test self-contained, given that any database may be used for the test and even the metastore's database type can be different.
[WebHCat]Enable directly invoke Sqoop job through Templeton --- Key: HIVE-5072 URL: https://issues.apache.org/jira/browse/HIVE-5072 Project: Hive Issue Type: Improvement Components: WebHCat Affects Versions: 0.12.0 Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5072.1.patch, HIVE-5072.2.patch, HIVE-5072.3.patch, Templeton-Sqoop-Action.pdf Now it is hard to invoke a Sqoop job through Templeton. The only way is to use the classpath jar generated by a Sqoop job and use the jar delegator in Templeton. We should implement a Sqoop delegator to enable directly invoking Sqoop jobs through Templeton. -- This message was sent by Atlassian JIRA (v6.2#6252)
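For readers unfamiliar with Sqoop options files: per the Sqoop user guide linked above, an options file holds one argument per line and supports # comments, which is what makes it handy for the shared connection fragment described in answer 0. The values below are purely illustrative:
{noformat}
# shared-connection.options: reusable fragment referenced via --options-file
--connect
jdbc:mysql://db.example.com/corp
--username
hive_user
--password
secret
{noformat}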
[jira] [Updated] (HIVE-6887) Add missing params to hive-default.xml.template
[ https://issues.apache.org/jira/browse/HIVE-6887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-6887: Fix Version/s: 0.13.0 Add missing params to hive-default.xml.template Key: HIVE-6887 URL: https://issues.apache.org/jira/browse/HIVE-6887 Project: Hive Issue Type: Bug Reporter: Harish Butani Fix For: 0.13.0 Attachments: HIVE-6887.1.patch Add in the ones that were added to HiveConf, but not the template.xml file; For 0.13 we will not be moving to HIVE-6037 style of genning the template file. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6873) DISTINCT clause in aggregates is handled incorrectly by vectorized execution
[ https://issues.apache.org/jira/browse/HIVE-6873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-6873: Fix Version/s: 0.13.0 DISTINCT clause in aggregates is handled incorrectly by vectorized execution Key: HIVE-6873 URL: https://issues.apache.org/jira/browse/HIVE-6873 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0, 0.14.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Fix For: 0.13.0 Attachments: HIVE-6873.1.patch, HIVE-6873.2.patch, HIVE-6873.3.patch, image001.png, image002.png, image003.png, image004.png The vectorized aggregates ignore the DISTINCT clause. This cause incorrect results. Due to how GroupByOperatorDesc adds the DISTINCT keys to the overall aggregate keys the vectorized aggregates do account for the extra key, but they do not process the data correctly for the key. the reduce side the aggregates the input from the vectorized map side to results that are only sometimes correct but mostly incorrect. HIVE-4607 tracks the proper fix, but meantime I'm filing a bug to disable vectorized execution if DISTINCT is present. Fix is trivial. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6732) Update Release Notes for Hive 0.13
[ https://issues.apache.org/jira/browse/HIVE-6732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-6732: Attachment: (was: HIVE-6732.3.patch) Update Release Notes for Hive 0.13 -- Key: HIVE-6732 URL: https://issues.apache.org/jira/browse/HIVE-6732 Project: Hive Issue Type: Bug Affects Versions: 0.13.0, 0.14.0 Reporter: Harish Butani Assignee: Harish Butani Priority: Trivial Fix For: 0.13.0 Attachments: HIVE-6732.1.patch, HIVE-6732.2.patch, HIVE-6732.3.patch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6732) Update Release Notes for Hive 0.13
[ https://issues.apache.org/jira/browse/HIVE-6732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-6732: Attachment: HIVE-6732.3.patch Update Release Notes for Hive 0.13 -- Key: HIVE-6732 URL: https://issues.apache.org/jira/browse/HIVE-6732 Project: Hive Issue Type: Bug Affects Versions: 0.13.0, 0.14.0 Reporter: Harish Butani Assignee: Harish Butani Priority: Trivial Fix For: 0.13.0 Attachments: HIVE-6732.1.patch, HIVE-6732.2.patch, HIVE-6732.3.patch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Timeline for the Hive 0.13 release?
I am starting the process to cut an rc. No more commits to the 0.13 branch. regards, Harish.
On Apr 10, 2014, at 11:53 AM, Harish Butani hbut...@hortonworks.com wrote: Lefty, I added in the missing ones and have a patch, jira is HIVE-6887: - The ones from 6500 and 6466 are already attached. - I got the comments for 6447 from Vaibhav. - Applied the patch from HIVE-6503. I think it got closed as a Duplicate. Sorry if it looks like I am rushing you. Just trying to help. Please review or replace with your patch. Will wait for your response. We are most likely down to this issue. Would love to cut the rc today/early tomorrow. regards, Harish.
On Apr 10, 2014, at 10:41 AM, Vaibhav Gumashta vgumas...@hortonworks.com wrote: In fact, Harish just pointed out, the SSL configs are not there in the hive template. Thanks for pointing out, Lefty!
On Thu, Apr 10, 2014 at 1:21 PM, Vaibhav Gumashta vgumas...@hortonworks.com wrote: Hi Lefty, All the HiveServer2 related configs are already in hive-site.xml.template and HiveConf.java. Thanks, --Vaibhav
On Thu, Apr 10, 2014 at 7:54 AM, Lefty Leverenz leftylever...@gmail.com wrote: Harish, here are some additions to your list, with links and patch excerpts:
HIVE-5351 https://issues.apache.org/jira/browse/HIVE-5351 (linked doc jira HIVE-6318 https://issues.apache.org/jira/browse/HIVE-6318 doesn't provide definitions for the template file but documents these configs in the wiki -- Setting Up HiveServer2 - SSL Encryption https://cwiki.apache.org/confluence/display/Hive/Setting+up+HiveServer2#SettingUpHiveServer2-SSLEncryption ):
+HIVE_SERVER2_USE_SSL("hive.server2.use.SSL", false),
+HIVE_SERVER2_SSL_KEYSTORE_PATH("hive.server2.keystore.path", ""),
+HIVE_SERVER2_SSL_KEYSTORE_PASSWORD("hive.server2.keystore.password", ""),
HIVE-6447 (on your list): HIVE_CONVERT_JOIN_BUCKET_MAPJOIN_TEZ("hive.convert.join.bucket.mapjoin.tez", false), description provided in jira comment https://issues.apache.org/jira/browse/HIVE-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13959765#comment-13959765 :
{code}
<property>
  <name>hive.convert.join.bucket.mapjoin.tez</name>
  <value>false</value>
  <description>Whether joins can be automatically converted to bucket map joins in hive when tez is used as the execution engine.</description>
</property>
{code}
HIVE-6500 https://issues.apache.org/jira/browse/HIVE-6500: HIVESTATSDBCLASS("hive.stats.dbclass", "fs", new PatternValidator("jdbc(:.*)", "hbase", "counter", "custom", "fs")), // StatsSetupConst.StatDB *Need to add fs to template description:*
<name>hive.stats.dbclass</name>
<value>counter</value>
<description>The storage that stores temporary Hive statistics. Currently, jdbc, hbase, counter and custom type are supported.</description>
HIVE-6466 https://issues.apache.org/jira/browse/HIVE-6466 added a config value (PAM) and a new config (hive.server2.authentication.pam.services):
HIVE_SERVER2_AUTHENTICATION("hive.server2.authentication", "NONE",
-new StringsValidator("NOSASL", "NONE", "LDAP", "KERBEROS", "CUSTOM")),
+new StringsValidator("NOSASL", "NONE", "LDAP", "KERBEROS", "PAM", "CUSTOM")),
...
+// List of the underlying pam services that should be used when auth type is PAM
+// A file with the same name must exist in /etc/pam.d
+HIVE_SERVER2_PAM_SERVICES("hive.server2.authentication.pam.services", null),
It's documented in the wiki by HIVE-6318 https://issues.apache.org/jira/browse/HIVE-6318 in Setting Up HiveServer2 - Pluggable Authentication Modules (PAM) https://cwiki.apache.org/confluence/display/Hive/Setting+up+HiveServer2#SettingUpHiveServer2-PluggableAuthenticationModules(PAM) and supposedly documented in the template file by HIVE-6503 https://issues.apache.org/jira/browse/HIVE-6503, which says committed for 0.13.0 but *doesn't show up in branch 13 or trunk.* HIVE-6503.1.patch https://issues.apache.org/jira/secure/attachment/12633674/HIVE-6503.1.patch :
@@ -2165,6 +2165,7 @@
    NONE: no authentication check
    LDAP: LDAP/AD based authentication
    KERBEROS: Kerberos/GSSAPI authentication
+   PAM: Pluggable authentication module
    CUSTOM: Custom authentication provider (Use with property hive.server2.custom.authentication.class)
  </description>
@@ -2217,6 +2218,15 @@
 </property>
 <property>
+  <name>hive.server2.authentication.pam.services</name>
+  <value></value>
+  <description>
+    List of the underlying PAM services that should be used when auth type is PAM.
+    A file with the same name must exist in /etc/pam.d.
+  </description>
+</property>
+
+<property>
   <name>hive.server2.enable.doAs</name>
   <value>true</value>
   <description>
HIVE-6681 https://issues.apache.org/jira/browse/HIVE-6681: +
[jira] [Updated] (HIVE-6785) query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe
[ https://issues.apache.org/jira/browse/HIVE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tongjie Chen updated HIVE-6785: --- Attachment: HIVE-6785.3.patch replace deprecated parquet class. query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe -- Key: HIVE-6785 URL: https://issues.apache.org/jira/browse/HIVE-6785 Project: Hive Issue Type: Bug Components: File Formats, Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Tongjie Chen Fix For: 0.14.0 Attachments: HIVE-6785.1.patch.txt, HIVE-6785.2.patch.txt, HIVE-6785.3.patch When a hive table's SerDe is ParquetHiveSerDe, while some partitions are of other SerDe, AND if this table has string column[s], hive generates confusing error message: Failed with exception java.io.IOException:java.lang.ClassCastException: parquet.hive.serde.primitive.ParquetStringInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.primitive.SettableTimestampObjectInspector This is confusing because timestamp is mentioned even if it is not been used by the table. The reason is when there is SerDe difference between table and partition, hive tries to convert objectinspector of two SerDes. ParquetHiveSerDe's object inspector for string type is ParquetStringInspector (newly introduced), neither a subclass of WritableStringObjectInspector nor JavaStringObjectInspector, which ObjectInspectorConverters expect for string category objector inspector. There is no break statement in STRING case statement, hence the following TIMESTAMP case statement is executed, generating confusing error message. see also in the following parquet issue: https://github.com/Parquet/parquet-mr/issues/324 To fix that it is relatively easy, just make ParquetStringInspector subclass of JavaStringObjectInspector instead of AbstractPrimitiveJavaObjectInspector. But because constructor of class JavaStringObjectInspector is package scope instead of public or protected, we would need to move ParquetStringInspector to the same package with JavaStringObjectInspector. Also ArrayWritableObjectInspector's setStructFieldData needs to also accept List data, since the corresponding setStructFieldData and create methods return a list. This is also needed when table SerDe is ParquetHiveSerDe, and partition SerDe is something else. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6896) Set tez credential file property along with MR conf property for Pig jobs
Eugene Koifman created HIVE-6896: Summary: Set tez credential file property along with MR conf property for Pig jobs Key: HIVE-6896 URL: https://issues.apache.org/jira/browse/HIVE-6896 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.13.0 webhcat should set the additional property - tez.credentials.path to the same value as the MapReduce property. WebHCat should always proactively set this tez.credentials.path property to the same value and in the same cases where it is setting the MR equivalent property. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
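A hedged sketch of the change being asked for: mirror whatever value WebHCat puts into the MapReduce credentials property onto the Tez property. The MR property name used here (mapreduce.job.credentials.binary) is an assumption for illustration; the description itself only names tez.credentials.path.
{code}
import org.apache.hadoop.conf.Configuration;

public class CredentialsConfSketch {
  static final String MR_CREDENTIALS  = "mapreduce.job.credentials.binary"; // assumed MR property
  static final String TEZ_CREDENTIALS = "tez.credentials.path";

  // Call wherever the delegator already sets the MR property, so Tez jobs stay in sync.
  public static void propagate(Configuration conf, String credentialsFile) {
    conf.set(MR_CREDENTIALS, credentialsFile);
    conf.set(TEZ_CREDENTIALS, credentialsFile);
  }
}
{code}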
[jira] [Updated] (HIVE-6896) Set tez credential file property along with MR conf property for Pig jobs
[ https://issues.apache.org/jira/browse/HIVE-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6896: - Fix Version/s: (was: 0.13.0) Set tez credential file property along with MR conf property for Pig jobs - Key: HIVE-6896 URL: https://issues.apache.org/jira/browse/HIVE-6896 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman webhcat should set the additional property - tez.credentials.path to the same value as the MapReduce property. WebHCat should always proactively set this tez.credentials.path property to the same value and in the same cases where it is setting the MR equivalent property. NO PRECOMMIT TESTS HIVE-6780 made the change in HiveDelegator. The same needs to be done in PigDelegator when Pig (0.13?) adds support for Tez. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6896) Set tez credential file property along with MR conf property for Pig jobs
[ https://issues.apache.org/jira/browse/HIVE-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6896: - Description: webhcat should set the additional property - tez.credentials.path to the same value as the MapReduce property. WebHCat should always proactively set this tez.credentials.path property to the same value and in the same cases where it is setting the MR equivalent property. NO PRECOMMIT TESTS HIVE-6780 made the change in HiveDelegator. The same needs to be done in PigDelegator when Pig (0.13?) adds support for Tez. was: webhcat should set the additional property - tez.credentials.path to the same value as the MapReduce property. WebHCat should always proactively set this tez.credentials.path property to the same value and in the same cases where it is setting the MR equivalent property. NO PRECOMMIT TESTS Set tez credential file property along with MR conf property for Pig jobs - Key: HIVE-6896 URL: https://issues.apache.org/jira/browse/HIVE-6896 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman webhcat should set the additional property - tez.credentials.path to the same value as the MapReduce property. WebHCat should always proactively set this tez.credentials.path property to the same value and in the same cases where it is setting the MR equivalent property. NO PRECOMMIT TESTS HIVE-6780 made the change in HiveDelegator. The same needs to be done in PigDelegator when Pig (0.13?) adds support for Tez. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5847) DatabaseMetadata.getColumns() doesn't show correct column size for char/varchar/decimal
[ https://issues.apache.org/jira/browse/HIVE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13966958#comment-13966958 ] Xuefu Zhang commented on HIVE-5847: --- +1 DatabaseMetadata.getColumns() doesn't show correct column size for char/varchar/decimal --- Key: HIVE-5847 URL: https://issues.apache.org/jira/browse/HIVE-5847 Project: Hive Issue Type: Bug Components: JDBC Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-5847.1.patch column_size, decimal_digits, num_prec_radix should be set appropriately based on the type qualifiers. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5847) DatabaseMetadata.getColumns() doesn't show correct column size for char/varchar/decimal
[ https://issues.apache.org/jira/browse/HIVE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-5847: -- Attachment: HIVE-5847.1.patch Reattach the same patch to re-run test, since the patch has been there for quite a while. DatabaseMetadata.getColumns() doesn't show correct column size for char/varchar/decimal --- Key: HIVE-5847 URL: https://issues.apache.org/jira/browse/HIVE-5847 Project: Hive Issue Type: Bug Components: JDBC Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-5847.1.patch, HIVE-5847.1.patch column_size, decimal_digits, num_prec_radix should be set appropriately based on the type qualifiers. -- This message was sent by Atlassian JIRA (v6.2#6252)
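For context, these are the java.sql.DatabaseMetaData.getColumns() conventions the patch is aligning with; the helper below is only an illustrative sketch, not the Hive code.
{code}
public class ColumnMetadataSketch {
  // COLUMN_SIZE: declared length for char/varchar, precision for decimal.
  static int columnSize(String type, int length, int precision) {
    switch (type) {
      case "char":
      case "varchar": return length;
      case "decimal": return precision;
      default:        return 0;   // other types are handled elsewhere
    }
  }

  // DECIMAL_DIGITS: scale for decimal, 0 for character types.
  static int decimalDigits(String type, int scale) {
    return "decimal".equals(type) ? scale : 0;
  }

  // NUM_PREC_RADIX: 10 for numeric types, null (not applicable) for character types.
  static Integer numPrecRadix(String type) {
    return "decimal".equals(type) ? Integer.valueOf(10) : null;
  }

  public static void main(String[] args) {
    System.out.println(columnSize("varchar", 20, 0));  // 20
    System.out.println(columnSize("decimal", 0, 10));  // 10 (precision)
    System.out.println(decimalDigits("decimal", 2));   // 2  (scale)
  }
}
{code}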
[jira] [Commented] (HIVE-6890) Bug in HiveStreaming API causes problems if hive-site.xml is missing on streaming client side
[ https://issues.apache.org/jira/browse/HIVE-6890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13966966#comment-13966966 ] Roshan Naik commented on HIVE-6890: --- Test failure is unrelated to patch. Bug in HiveStreaming API causes problems if hive-site.xml is missing on streaming client side - Key: HIVE-6890 URL: https://issues.apache.org/jira/browse/HIVE-6890 Project: Hive Issue Type: Bug Reporter: Roshan Naik Assignee: Roshan Naik Attachments: HIVE-6890.patch Incorrect conf object being passed to MetaStore client in AbstractRecordWriter is causing the issue. -- This message was sent by Atlassian JIRA (v6.2#6252)
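A hedged illustration of the direction the description points at (this is not the HIVE-6890 patch itself): construct the metastore client from the HiveConf the streaming endpoint was given, rather than from a fresh HiveConf() that silently falls back to defaults when hive-site.xml is absent on the client machine.
{code}
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.MetaException;

public class StreamingClientSketch {
  // endpointConf already carries hive.metastore.uris supplied by the caller,
  // so no hive-site.xml is required on the streaming client side.
  public static HiveMetaStoreClient connect(HiveConf endpointConf) throws MetaException {
    return new HiveMetaStoreClient(endpointConf);
  }
}
{code}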
[jira] [Commented] (HIVE-6862) add DB schema DDL and upgrade 12to13 scripts for MS SQL Server
[ https://issues.apache.org/jira/browse/HIVE-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13966979#comment-13966979 ] Ashutosh Chauhan commented on HIVE-6862: [~eugene.koifman] All other DBs also have README providing instructions on how to effect upgrade. It will be good to have that for MS SQL as well. add DB schema DDL and upgrade 12to13 scripts for MS SQL Server -- Key: HIVE-6862 URL: https://issues.apache.org/jira/browse/HIVE-6862 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-6862.patch need to add a unifed 0.13 script and a separate script for ACID support NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6131) New columns after table alter result in null values despite data
[ https://issues.apache.org/jira/browse/HIVE-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967001#comment-13967001 ] Pala M Muthaia commented on HIVE-6131: -- You are right, types of existing columns may change so partition schema may never be same as table schema, so cannot pick one or the other. Let's say we support add columns DDL at partition level. What can be allowed? Can users add arbitrarily different columns compared to table, or should they only add columns that are present in table level, but are missing at partition level, in the same order? e.g: Initial schema: Table t (A, B, C, D), Partition p (A', B'). Can users only execute 'Alter table t partition (p) add columns C,D'? Or can they do something else also 'alter table t partition (p) add columns E, F, G'? If it is only the former, then we still can do the same programmatically, by 'merging' the partition and table schema at runtime. However, if the table schema itself can be wildly different compared to partition schema, then yes, DDL is the only option, and users have to manage it themselves. New columns after table alter result in null values despite data Key: HIVE-6131 URL: https://issues.apache.org/jira/browse/HIVE-6131 Project: Hive Issue Type: Bug Affects Versions: 0.11.0, 0.12.0, 0.13.0 Reporter: James Vaughan Priority: Minor Attachments: HIVE-6131.1.patch Hi folks, I found and verified a bug on our CDH 4.0.3 install of Hive when adding columns to tables with Partitions using 'REPLACE COLUMNS'. I dug through the Jira a little bit and didn't see anything for it so hopefully this isn't just noise on the radar. Basically, when you alter a table with partitions and then reupload data to that partition, it doesn't seem to recognize the extra data that actually exists in HDFS- as in, returns NULL values on the new column despite having the data and recognizing the new column in the metadata. Here's some steps to reproduce using a basic table: 1. Run this hive command: CREATE TABLE jvaughan_test (col1 string) partitioned by (day string); 2. Create a simple file on the system with a couple of entries, something like hi and hi2 separated by newlines. 3. Run this hive command, pointing it at the file: LOAD DATA LOCAL INPATH 'FILEDIR' OVERWRITE INTO TABLE jvaughan_test PARTITION (day = '2014-01-02'); 4. Confirm the data with: SELECT * FROM jvaughan_test WHERE day = '2014-01-02'; 5. Alter the column definitions: ALTER TABLE jvaughan_test REPLACE COLUMNS (col1 string, col2 string); 6. Edit your file and add a second column using the default separator (ctrl+v, then ctrl+a in Vim) and add two more entries, such as hi3 on the first row and hi4 on the second 7. Run step 3 again 8. Check the data again like in step 4 For me, this is the results that get returned: hive select * from jvaughan_test where day = '2014-01-01'; OK hiNULL2014-01-02 hi2 NULL2014-01-02 This is despite the fact that there is data in the file stored by the partition in HDFS. Let me know if you need any other information. The only workaround for me currently is to drop partitions for any I'm replacing data in and THEN reupload the new data file. Thanks, -James -- This message was sent by Atlassian JIRA (v6.2#6252)
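A small sketch of the runtime "merging" idea from the comment above, under the stated restriction that a partition may only be missing trailing table columns; the column names are illustrative.
{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SchemaMergeSketch {
  // The partition keeps its own columns and inherits any trailing columns it is
  // missing from the table schema; those columns read as NULL until data carries them.
  static List<String> merge(List<String> tableCols, List<String> partitionCols) {
    List<String> merged = new ArrayList<>(partitionCols);
    for (int i = partitionCols.size(); i < tableCols.size(); i++) {
      merged.add(tableCols.get(i));
    }
    return merged;
  }

  public static void main(String[] args) {
    List<String> table = Arrays.asList("A", "B", "C", "D");
    List<String> partition = Arrays.asList("A", "B");
    System.out.println(merge(table, partition));   // [A, B, C, D]
  }
}
{code}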
[jira] [Resolved] (HIVE-6732) Update Release Notes for Hive 0.13
[ https://issues.apache.org/jira/browse/HIVE-6732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani resolved HIVE-6732. - Resolution: Fixed Committed to trunk and 0.13 Update Release Notes for Hive 0.13 -- Key: HIVE-6732 URL: https://issues.apache.org/jira/browse/HIVE-6732 Project: Hive Issue Type: Bug Affects Versions: 0.13.0, 0.14.0 Reporter: Harish Butani Assignee: Harish Butani Priority: Trivial Fix For: 0.13.0 Attachments: HIVE-6732.1.patch, HIVE-6732.2.patch, HIVE-6732.3.patch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6897) Allow overwrite/append to external Hive table (with partitions) via HCatStorer
Dip Kharod created HIVE-6897: Summary: Allow overwrite/append to external Hive table (with partitions) via HCatStorer Key: HIVE-6897 URL: https://issues.apache.org/jira/browse/HIVE-6897 Project: Hive Issue Type: Improvement Components: HCatalog, HiveServer2 Affects Versions: 0.12.0 Reporter: Dip Kharod I'm using HCatStorer to write to external Hive table with partition from Pig and have the following different use cases: 1) Need to overwrite (aka, refresh) data into table: Currently I end up doing this outside (drop partition and delete HDFS folder) of Pig which is very painful and error-prone 2) Need to append (aka, add new file) data to the Hive external table/partition: Again, I end up doing this outside of Pig by copying file in appropriate folder It would be very productive for the developers to have both options in HCatStorer. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6785) query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe
[ https://issues.apache.org/jira/browse/HIVE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967047#comment-13967047 ] Hive QA commented on HIVE-6785: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12639834/HIVE-6785.3.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5615 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_dyn_part org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_reduce_deduplicate org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2 {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2221/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2221/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12639834 query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe -- Key: HIVE-6785 URL: https://issues.apache.org/jira/browse/HIVE-6785 Project: Hive Issue Type: Bug Components: File Formats, Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Tongjie Chen Fix For: 0.14.0 Attachments: HIVE-6785.1.patch.txt, HIVE-6785.2.patch.txt, HIVE-6785.3.patch When a hive table's SerDe is ParquetHiveSerDe, while some partitions are of other SerDe, AND if this table has string column[s], hive generates confusing error message: Failed with exception java.io.IOException:java.lang.ClassCastException: parquet.hive.serde.primitive.ParquetStringInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.primitive.SettableTimestampObjectInspector This is confusing because timestamp is mentioned even if it is not been used by the table. The reason is when there is SerDe difference between table and partition, hive tries to convert objectinspector of two SerDes. ParquetHiveSerDe's object inspector for string type is ParquetStringInspector (newly introduced), neither a subclass of WritableStringObjectInspector nor JavaStringObjectInspector, which ObjectInspectorConverters expect for string category objector inspector. There is no break statement in STRING case statement, hence the following TIMESTAMP case statement is executed, generating confusing error message. see also in the following parquet issue: https://github.com/Parquet/parquet-mr/issues/324 To fix that it is relatively easy, just make ParquetStringInspector subclass of JavaStringObjectInspector instead of AbstractPrimitiveJavaObjectInspector. But because constructor of class JavaStringObjectInspector is package scope instead of public or protected, we would need to move ParquetStringInspector to the same package with JavaStringObjectInspector. Also ArrayWritableObjectInspector's setStructFieldData needs to also accept List data, since the corresponding setStructFieldData and create methods return a list. This is also needed when table SerDe is ParquetHiveSerDe, and partition SerDe is something else. -- This message was sent by Atlassian JIRA (v6.2#6252)
[VOTE] Apache Hive 0.13.0 Release Candidate 0
Apache Hive 0.13.0 Release Candidate 0 is available here: http://people.apache.org/~rhbutani/hive-0.13.0-candidate-0 Maven artifacts are available here: https://repository.apache.org/content/repositories/orgapachehive-1008 Source tag for RCN is at: https://svn.apache.org/repos/asf/hive/tags/release-0.13.0-rc0/ Voting will conclude in 72 hours. Hive PMC Members: Please test and vote. Thanks.
[jira] [Commented] (HIVE-6785) query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe
[ https://issues.apache.org/jira/browse/HIVE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967069#comment-13967069 ] Tongjie Chen commented on HIVE-6785: [~szehon], are those failure transient? the new patch only changes the parquet class, which has nothing to do with these test cases. query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe -- Key: HIVE-6785 URL: https://issues.apache.org/jira/browse/HIVE-6785 Project: Hive Issue Type: Bug Components: File Formats, Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Tongjie Chen Fix For: 0.14.0 Attachments: HIVE-6785.1.patch.txt, HIVE-6785.2.patch.txt, HIVE-6785.3.patch When a hive table's SerDe is ParquetHiveSerDe, while some partitions are of other SerDe, AND if this table has string column[s], hive generates confusing error message: Failed with exception java.io.IOException:java.lang.ClassCastException: parquet.hive.serde.primitive.ParquetStringInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.primitive.SettableTimestampObjectInspector This is confusing because timestamp is mentioned even if it is not been used by the table. The reason is when there is SerDe difference between table and partition, hive tries to convert objectinspector of two SerDes. ParquetHiveSerDe's object inspector for string type is ParquetStringInspector (newly introduced), neither a subclass of WritableStringObjectInspector nor JavaStringObjectInspector, which ObjectInspectorConverters expect for string category objector inspector. There is no break statement in STRING case statement, hence the following TIMESTAMP case statement is executed, generating confusing error message. see also in the following parquet issue: https://github.com/Parquet/parquet-mr/issues/324 To fix that it is relatively easy, just make ParquetStringInspector subclass of JavaStringObjectInspector instead of AbstractPrimitiveJavaObjectInspector. But because constructor of class JavaStringObjectInspector is package scope instead of public or protected, we would need to move ParquetStringInspector to the same package with JavaStringObjectInspector. Also ArrayWritableObjectInspector's setStructFieldData needs to also accept List data, since the corresponding setStructFieldData and create methods return a list. This is also needed when table SerDe is ParquetHiveSerDe, and partition SerDe is something else. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6876) Logging information should include thread id
[ https://issues.apache.org/jira/browse/HIVE-6876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6876: --- Affects Version/s: (was: 0.14.0) Logging information should include thread id Key: HIVE-6876 URL: https://issues.apache.org/jira/browse/HIVE-6876 Project: Hive Issue Type: Improvement Components: HiveServer2, Metastore Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Trivial Fix For: 0.13.0 Attachments: HIVE-6876.1.patch The multi-threaded nature of hive server and remote metastore makes it difficult to debug issues without enabling thread information. It would be nice to have the thread id in the logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
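For context on HIVE-6876, the sketch below is not the patch itself; it only shows how thread names typically end up in log4j 1.2 output (the logging library Hive ships at this point), namely via the %t conversion character in the layout pattern. With the thread name in every line, concurrent HiveServer2 and metastore logs become attributable to a single request.
{code}
// Minimal, illustrative sketch only -- not the HIVE-6876 change.
import org.apache.log4j.ConsoleAppender;
import org.apache.log4j.Logger;
import org.apache.log4j.PatternLayout;

public class ThreadIdLoggingDemo {
  public static void main(String[] args) {
    // %t prints the name of the thread that emitted the log event
    PatternLayout layout = new PatternLayout("%d{ISO8601} %-5p [%t] %c{2}: %m%n");
    Logger root = Logger.getRootLogger();
    root.addAppender(new ConsoleAppender(layout));
    root.info("hello");  // e.g. "2014-04-11 ... INFO [main] ThreadIdLoggingDemo: hello"
  }
}
{code}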
[jira] [Updated] (HIVE-6876) Logging information should include thread id
[ https://issues.apache.org/jira/browse/HIVE-6876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6876: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Logging information should include thread id Key: HIVE-6876 URL: https://issues.apache.org/jira/browse/HIVE-6876 Project: Hive Issue Type: Improvement Components: HiveServer2, Metastore Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Trivial Fix For: 0.13.0 Attachments: HIVE-6876.1.patch The multi-threaded nature of hive server and remote metastore makes it difficult to debug issues without enabling thread information. It would be nice to have the thread id in the logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6893) Out of sequence response
[ https://issues.apache.org/jira/browse/HIVE-6893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967082#comment-13967082 ] Ashutosh Chauhan commented on HIVE-6893: [~romainr] Can you provide more details about your setup? Was metastore running as separate process or embedded within HS2 ? Were HS2 clients odbc or jdbc ? How many concurrent clients were there and how many queries they were firing? Out of sequence response Key: HIVE-6893 URL: https://issues.apache.org/jira/browse/HIVE-6893 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Romain Rigaux Calls listing databases or tables fail. It seems to be a concurrency problem. {code} 014-03-06 05:34:00,785 ERROR hive.log: org.apache.thrift.TApplicationException: get_databases failed: out of sequence response at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:472) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:459) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:648) at org.apache.hive.service.cli.operation.GetSchemasOperation.run(GetSchemasOperation.java:66) at org.apache.hive.service.cli.session.HiveSessionImpl.getSchemas(HiveSessionImpl.java:278) at sun.reflect.GeneratedMethodAccessor323.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:62) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:582) at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:57) at com.sun.proxy.$Proxy9.getSchemas(Unknown Source) at org.apache.hive.service.cli.CLIService.getSchemas(CLIService.java:192) at org.apache.hive.service.cli.thrift.ThriftCLIService.GetSchemas(ThriftCLIService.java:263) at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1433) at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1418) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hive.service.cli.thrift.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:38) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:244) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6785) query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe
[ https://issues.apache.org/jira/browse/HIVE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967087#comment-13967087 ] Szehon Ho commented on HIVE-6785: - Yeah, it doesn't look related to this patch. query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe -- Key: HIVE-6785 URL: https://issues.apache.org/jira/browse/HIVE-6785 Project: Hive Issue Type: Bug Components: File Formats, Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Tongjie Chen Fix For: 0.14.0 Attachments: HIVE-6785.1.patch.txt, HIVE-6785.2.patch.txt, HIVE-6785.3.patch When a Hive table's SerDe is ParquetHiveSerDe while some partitions use a different SerDe, and the table has string column[s], Hive generates a confusing error message: Failed with exception java.io.IOException:java.lang.ClassCastException: parquet.hive.serde.primitive.ParquetStringInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.primitive.SettableTimestampObjectInspector This is confusing because timestamp is mentioned even though the table does not use it. The reason is that when the table and partition SerDes differ, Hive tries to convert between the object inspectors of the two SerDes. ParquetHiveSerDe's object inspector for the string type is ParquetStringInspector (newly introduced), which is neither a subclass of WritableStringObjectInspector nor of JavaStringObjectInspector, the inspectors ObjectInspectorConverters expects for the string category. There is no break statement in the STRING case, so the following TIMESTAMP case is executed, generating the confusing error message. See also the following parquet issue: https://github.com/Parquet/parquet-mr/issues/324 The fix is relatively easy: make ParquetStringInspector a subclass of JavaStringObjectInspector instead of AbstractPrimitiveJavaObjectInspector. But because the constructor of JavaStringObjectInspector is package scoped rather than public or protected, we would need to move ParquetStringInspector into the same package as JavaStringObjectInspector. Also, ArrayWritableObjectInspector's setStructFieldData needs to accept List data, since the corresponding setStructFieldData and create methods return a list. This is also needed when the table SerDe is ParquetHiveSerDe and the partition SerDe is something else. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6893) Out of sequence response
[ https://issues.apache.org/jira/browse/HIVE-6893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967089#comment-13967089 ] Vaibhav Gumashta commented on HIVE-6893: [~romainr] This seems like a metastore concurrency issue. It seems that the metastore client socket is reading the RPC response from a different call, hence the out of sequence exception. I think we'll have a better idea once we learn more about [~ashutoshc]'s queries. Out of sequence response Key: HIVE-6893 URL: https://issues.apache.org/jira/browse/HIVE-6893 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Romain Rigaux Calls listing databases or tables fail. It seems to be a concurrency problem. {code} 014-03-06 05:34:00,785 ERROR hive.log: org.apache.thrift.TApplicationException: get_databases failed: out of sequence response at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:472) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:459) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:648) at org.apache.hive.service.cli.operation.GetSchemasOperation.run(GetSchemasOperation.java:66) at org.apache.hive.service.cli.session.HiveSessionImpl.getSchemas(HiveSessionImpl.java:278) at sun.reflect.GeneratedMethodAccessor323.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:62) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:582) at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:57) at com.sun.proxy.$Proxy9.getSchemas(Unknown Source) at org.apache.hive.service.cli.CLIService.getSchemas(CLIService.java:192) at org.apache.hive.service.cli.thrift.ThriftCLIService.GetSchemas(ThriftCLIService.java:263) at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1433) at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1418) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hive.service.cli.thrift.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:38) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:244) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
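The failure mode described in HIVE-6893 is typical of a single Thrift client connection being shared by several threads: if two threads issue calls on the same connection, one can read the response that belongs to the other, which Thrift reports as an out of sequence response. The sketch below only illustrates the general mitigation (serializing access to the shared client); the wrapper and interface names are hypothetical and this is not the actual Hive fix.
{code}
public class SerializedMetaStoreClient {
  private final Object lock = new Object();
  private final HiveMetaStoreClientLike delegate;   // stands in for the real client

  public SerializedMetaStoreClient(HiveMetaStoreClientLike delegate) {
    this.delegate = delegate;
  }

  public java.util.List<String> getDatabases(String pattern) throws Exception {
    synchronized (lock) {                           // one RPC in flight per connection at a time
      return delegate.getDatabases(pattern);
    }
  }

  /** Minimal stand-in interface so the sketch is self-contained. */
  public interface HiveMetaStoreClientLike {
    java.util.List<String> getDatabases(String pattern) throws Exception;
  }
}
{code}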
[jira] [Commented] (HIVE-6893) Out of sequence response
[ https://issues.apache.org/jira/browse/HIVE-6893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967091#comment-13967091 ] Vaibhav Gumashta commented on HIVE-6893: [~romainr] As a followup, can you also try starting HiveServer2 with -hiveconf hive.metastore.uris= and see if you get the same error? Out of sequence response Key: HIVE-6893 URL: https://issues.apache.org/jira/browse/HIVE-6893 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Romain Rigaux Calls listing databases or tables fail. It seems to be a concurrency problem. {code} 014-03-06 05:34:00,785 ERROR hive.log: org.apache.thrift.TApplicationException: get_databases failed: out of sequence response at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:472) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:459) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:648) at org.apache.hive.service.cli.operation.GetSchemasOperation.run(GetSchemasOperation.java:66) at org.apache.hive.service.cli.session.HiveSessionImpl.getSchemas(HiveSessionImpl.java:278) at sun.reflect.GeneratedMethodAccessor323.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:62) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:582) at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:57) at com.sun.proxy.$Proxy9.getSchemas(Unknown Source) at org.apache.hive.service.cli.CLIService.getSchemas(CLIService.java:192) at org.apache.hive.service.cli.thrift.ThriftCLIService.GetSchemas(ThriftCLIService.java:263) at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1433) at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1418) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hive.service.cli.thrift.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:38) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:244) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6432) Remove deprecated methods in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6432: --- Status: Patch Available (was: Open) Remove deprecated methods in HCatalog - Key: HIVE-6432 URL: https://issues.apache.org/jira/browse/HIVE-6432 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.14.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.14.0 Attachments: 6432-addendum.patch, 6432-full.patch, HIVE-6432.2.patch, HIVE-6432.3.patch, HIVE-6432.patch, HIVE-6432.wip.1.patch, HIVE-6432.wip.2.patch, hcat.6432.test.out There are a lot of methods in HCatalog that have been deprecated in HCatalog 0.5, and some that were recently deprecated in Hive 0.11 (joint release with HCatalog). The goal for HCatalog deprecation is that in general, after something has been deprecated, it is expected to stay around for 2 releases, which means hive-0.13 will be the last release to ship with all the methods that were deprecated in hive-0.11 (the org.apache.hcatalog.* files should all be removed afterwards), and it is also good for us to clean out and nuke all other older deprecated methods. We should take this on early in a dev/release cycle to allow us time to resolve all fallout, so I propose that we remove all HCatalog deprecated methods after we branch out 0.13 and 0.14 becomes trunk. -- This message was sent by Atlassian JIRA (v6.2#6252)
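For readers unfamiliar with the HCatalog deprecation described above, the snippet below illustrates the general shape of what is being removed; it is not actual HCatalog source, only an example of a class in the old org.apache.hcatalog namespace kept as a deprecated forward to its org.apache.hive.hcatalog replacement for two releases.
{code}
// Illustrative only -- the shape of the deprecated shims being removed in HIVE-6432.
package org.apache.hcatalog.common;

/** @deprecated Use {@link org.apache.hive.hcatalog.common.HCatConstants} instead. */
@Deprecated
public class HCatConstants {
  // constants mirror the replacement class so old client code keeps compiling
}
{code}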
[jira] [Updated] (HIVE-6432) Remove deprecated methods in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6432: --- Attachment: HIVE-6432.3.patch Attached latest rebased patch. Remove deprecated methods in HCatalog - Key: HIVE-6432 URL: https://issues.apache.org/jira/browse/HIVE-6432 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.14.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.14.0 Attachments: 6432-addendum.patch, 6432-full.patch, HIVE-6432.2.patch, HIVE-6432.3.patch, HIVE-6432.patch, HIVE-6432.wip.1.patch, HIVE-6432.wip.2.patch, hcat.6432.test.out There are a lot of methods in HCatalog that have been deprecated in HCatalog 0.5, and some that were recently deprecated in Hive 0.11 (joint release with HCatalog). The goal for HCatalog deprecation is that in general, after something has been deprecated, it is expected to stay around for 2 releases, which means hive-0.13 will be the last release to ship with all the methods that were deprecated in hive-0.11 (the org.apache.hcatalog.* files should all be removed afterwards), and it is also good for us to clean out and nuke all other older deprecated methods. We should take this on early in a dev/release cycle to allow us time to resolve all fallout, so I propose that we remove all HCatalog deprecated methods after we branch out 0.13 and 0.14 becomes trunk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5523) HiveHBaseStorageHandler should pass kerbros credentials down to HBase
[ https://issues.apache.org/jira/browse/HIVE-5523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967116#comment-13967116 ] Ashutosh Chauhan commented on HIVE-5523: +1 HiveHBaseStorageHandler should pass kerbros credentials down to HBase - Key: HIVE-5523 URL: https://issues.apache.org/jira/browse/HIVE-5523 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.11.0 Reporter: Nick Dimiduk Assignee: Sushanth Sowmyan Attachments: HIVE-5523.patch, Task Logs_ 'attempt_201310110032_0023_r_00_0'.html Running on a secured cluster, I have an HBase table defined thusly {noformat} CREATE TABLE IF NOT EXISTS pagecounts_hbase (rowkey STRING, pageviews STRING, bytes STRING) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,f:c1,f:c2') TBLPROPERTIES ('hbase.table.name' = 'pagecounts'); {noformat} and a query to populate that table {noformat} -- ensure hbase dependency jars are shipped with the MR job SET hive.aux.jars.path = file:///etc/hbase/conf/hbase-site.xml,file:///usr/lib/hive/lib/hive-hbase-handler-0.11.0.1.3.2.0-111.jar,file:///usr/lib/hbase/hbase-0.94.6.1.3.2.0-111-security.jar,file:///usr/lib/zookeeper/zookeeper-3.4.5.1.3.2.0-111.jar; -- populate our hbase table FROM pgc INSERT INTO TABLE pagecounts_hbase SELECT pgc.* WHERE rowkey LIKE 'en/q%' LIMIT 10; {noformat} The reduce tasks fail with what boils down to the following exception: {noformat} Caused by: java.lang.RuntimeException: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'. at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection$1.run(SecureClient.java:263) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.util.Methods.call(Methods.java:37) at org.apache.hadoop.hbase.security.User.call(User.java:590) at org.apache.hadoop.hbase.security.User.access$700(User.java:51) at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:444) at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.handleSaslConnectionFailure(SecureClient.java:224) at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.setupIOstreams(SecureClient.java:313) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974) at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Invoker.invoke(SecureRpcEngine.java:104) at $Proxy10.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.SecureRpcEngine.getProxy(SecureRpcEngine.java:146) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1346) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1305) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1292) at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1001) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:896) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:998) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:900) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:857) at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:234) at org.apache.hadoop.hbase.client.HTable.init(HTable.java:174) at org.apache.hadoop.hbase.client.HTable.init(HTable.java:133) at org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat.getHiveRecordWriter(HiveHBaseTableOutputFormat.java:83) at
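The SASL failure in the trace above is the classic symptom of map/reduce tasks lacking credentials: the submitting client holds the kerberos ticket, the tasks do not. The sketch below shows only the general technique for this class of problem (asking HBase for a delegation token at submit time and shipping it with the job credentials); it assumes the Hadoop 2 mapreduce API and the HBase TableMapReduceUtil helper, and is not necessarily what the HIVE-5523 patch does.
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

public class HBaseTokenExample {
  public static Job newJobWithHBaseCredentials(Configuration conf) throws IOException {
    Job job = Job.getInstance(HBaseConfiguration.create(conf));
    // Obtains an HBase delegation token for the current user (when security is enabled)
    // and adds it to the job's credentials, so tasks can talk to region servers without kinit.
    TableMapReduceUtil.initCredentials(job);
    return job;
  }
}
{code}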
[jira] [Commented] (HIVE-6432) Remove deprecated methods in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967119#comment-13967119 ] Ashutosh Chauhan commented on HIVE-6432: +1 Lets see how tests go! Remove deprecated methods in HCatalog - Key: HIVE-6432 URL: https://issues.apache.org/jira/browse/HIVE-6432 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.14.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.14.0 Attachments: 6432-addendum.patch, 6432-full.patch, HIVE-6432.2.patch, HIVE-6432.3.patch, HIVE-6432.patch, HIVE-6432.wip.1.patch, HIVE-6432.wip.2.patch, hcat.6432.test.out There are a lot of methods in HCatalog that have been deprecated in HCatalog 0.5, and some that were recently deprecated in Hive 0.11 (joint release with HCatalog). The goal for HCatalog deprecation is that in general, after something has been deprecated, it is expected to stay around for 2 releases, which means hive-0.13 will be the last release to ship with all the methods that were deprecated in hive-0.11 (the org.apache.hcatalog.* files should all be removed afterwards), and it is also good for us to clean out and nuke all other older deprecated methods. We should take this on early in a dev/release cycle to allow us time to resolve all fallout, so I propose that we remove all HCatalog deprecated methods after we branch out 0.13 and 0.14 becomes trunk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6898) Functions in hive are failing with java.lang.ClassNotFoundException on Tez
Vikram Dixit K created HIVE-6898: Summary: Functions in hive are failing with java.lang.ClassNotFoundException on Tez Key: HIVE-6898 URL: https://issues.apache.org/jira/browse/HIVE-6898 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.13.0, 0.14.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K {code} CREATE TABLE T1(key int, val STRING) STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '../../data/files/T1.txt' INTO TABLE T1; add jar /tmp/testudf.jar; create temporary function square as 'org.apache.hive.udf.UDFSquare'; select square(key) from T1 limit 3; {code} Fails with {code} Vertex failed, vertexName=Map 1, vertexId=vertex_1397230190905_0590_1_00, diagnostics=[Task failed, taskId=task_1397230190905_0590_1_00_00, diagnostics=[AttemptID:attempt_1397230190905_0590_1_00_00_0 Info:Error: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:145) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:163) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307) at org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557) at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
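The contents of testudf.jar are not part of the report above; only the class name org.apache.hive.udf.UDFSquare comes from the repro. A minimal UDF of the kind the repro assumes, using the classic org.apache.hadoop.hive.ql.exec.UDF bridge API, would look something like this (body is hypothetical):
{code}
package org.apache.hive.udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.IntWritable;

public class UDFSquare extends UDF {
  public IntWritable evaluate(IntWritable value) {
    if (value == null) {
      return null;       // propagate NULLs, as built-in UDFs do
    }
    return new IntWritable(value.get() * value.get());
  }
}
{code}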
[jira] [Updated] (HIVE-6898) Functions in hive are failing with java.lang.ClassNotFoundException on Tez
[ https://issues.apache.org/jira/browse/HIVE-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-6898: - Description: {code} CREATE TABLE T1(key int, val STRING) STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '../../data/files/T1.txt' INTO TABLE T1; add jar /tmp/testudf.jar; create temporary function square as 'org.apache.hive.udf.UDFSquare'; select square(key) from T1 limit 3; {code} Fails with {code} Vertex failed, vertexName=Map 1, vertexId=vertex_1397230190905_0590_1_00, diagnostics=[Task failed, taskId=task_1397230190905_0590_1_00_00, diagnostics=[AttemptID:attempt_1397230190905_0590_1_00_00_0 Info:Error: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:145) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:163) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307) at org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557) at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553) Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hive.udf.UDFSquare at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.getUdfClass(GenericUDFBridge.java:133) at org.apache.hadoop.hive.ql.exec.FunctionRegistry.isStateful(FunctionRegistry.java:1636) at org.apache.hadoop.hive.ql.exec.FunctionRegistry.isDeterministic(FunctionRegistry.java:1599) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.isDeterministic(ExprNodeGenericFuncEvaluator.java:132) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.iterate(ExprNodeEvaluatorFactory.java:83) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.toCachedEval(ExprNodeEvaluatorFactory.java:73) at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:59) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:460) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:416) at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:189) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:425) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:121) ... 
7 more {code} was: {code} CREATE TABLE T1(key int, val STRING) STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '../../data/files/T1.txt' INTO TABLE T1; add jar /tmp/testudf.jar; create temporary function square as 'org.apache.hive.udf.UDFSquare'; select square(key) from T1 limit 3; {code} Fails with {code} Vertex failed, vertexName=Map 1, vertexId=vertex_1397230190905_0590_1_00, diagnostics=[Task failed, taskId=task_1397230190905_0590_1_00_00, diagnostics=[AttemptID:attempt_1397230190905_0590_1_00_00_0 Info:Error: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:145) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:163) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307) at org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557) at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553) {code} Functions in hive are failing with java.lang.ClassNotFoundException on Tez -- Key: HIVE-6898 URL: https://issues.apache.org/jira/browse/HIVE-6898 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.13.0, 0.14.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K {code} CREATE TABLE T1(key int, val STRING) STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '../../data/files/T1.txt' INTO TABLE T1; add jar /tmp/testudf.jar; create temporary function square
[jira] [Commented] (HIVE-5269) Use thrift binary type for conveying binary values in hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967158#comment-13967158 ] Ashutosh Chauhan commented on HIVE-5269: +1 [~navis] Let's get this in. Needs a rebase. Use thrift binary type for conveying binary values in hiveserver2 - Key: HIVE-5269 URL: https://issues.apache.org/jira/browse/HIVE-5269 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-5269.2.patch.txt, HIVE-5269.D12873.1.patch Currently, binary values are encoded to string in HiveServer2 and decoded in the client. Just using the Thrift binary type might make this simpler. -- This message was sent by Atlassian JIRA (v6.2#6252)
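The difference HIVE-5269 targets can be illustrated generically (these are not the actual TCLIService types): carrying binary column values in a Thrift string field forces an encode on the server and a decode in the client, while a Thrift binary field maps to java.nio.ByteBuffer in Java and carries the bytes as-is. The sketch below uses java.util.Base64 purely for illustration of the transcoding cost.
{code}
import java.nio.ByteBuffer;
import java.util.Base64;

public class BinaryEncodingDemo {
  // string-typed wire field: byte[] -> String on the server, String -> byte[] in the client
  static String encodeForStringColumn(byte[] value) {
    return Base64.getEncoder().encodeToString(value);
  }
  static byte[] decodeOnClient(String wireValue) {
    return Base64.getDecoder().decode(wireValue);
  }

  // binary-typed wire field: no transcoding, the generated Thrift code takes a ByteBuffer
  static ByteBuffer asBinaryColumn(byte[] value) {
    return ByteBuffer.wrap(value);
  }
}
{code}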
[jira] [Updated] (HIVE-6898) Functions in hive are failing with java.lang.ClassNotFoundException on Tez
[ https://issues.apache.org/jira/browse/HIVE-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-6898: - Status: Patch Available (was: Open) Functions in hive are failing with java.lang.ClassNotFoundException on Tez -- Key: HIVE-6898 URL: https://issues.apache.org/jira/browse/HIVE-6898 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.13.0, 0.14.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-6898.1.patch {code} CREATE TABLE T1(key int, val STRING) STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '../../data/files/T1.txt' INTO TABLE T1; add jar /tmp/testudf.jar; create temporary function square as 'org.apache.hive.udf.UDFSquare'; select square(key) from T1 limit 3; {code} Fails with {code} Vertex failed, vertexName=Map 1, vertexId=vertex_1397230190905_0590_1_00, diagnostics=[Task failed, taskId=task_1397230190905_0590_1_00_00, diagnostics=[AttemptID:attempt_1397230190905_0590_1_00_00_0 Info:Error: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:145) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:163) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307) at org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557) at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553) Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hive.udf.UDFSquare at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.getUdfClass(GenericUDFBridge.java:133) at org.apache.hadoop.hive.ql.exec.FunctionRegistry.isStateful(FunctionRegistry.java:1636) at org.apache.hadoop.hive.ql.exec.FunctionRegistry.isDeterministic(FunctionRegistry.java:1599) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.isDeterministic(ExprNodeGenericFuncEvaluator.java:132) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.iterate(ExprNodeEvaluatorFactory.java:83) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.toCachedEval(ExprNodeEvaluatorFactory.java:73) at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:59) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:460) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:416) at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:189) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:425) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:121) ... 7 more {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6898) Functions in hive are failing with java.lang.ClassNotFoundException on Tez
[ https://issues.apache.org/jira/browse/HIVE-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-6898: - Attachment: HIVE-6898.1.patch Functions in hive are failing with java.lang.ClassNotFoundException on Tez -- Key: HIVE-6898 URL: https://issues.apache.org/jira/browse/HIVE-6898 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.13.0, 0.14.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-6898.1.patch {code} CREATE TABLE T1(key int, val STRING) STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '../../data/files/T1.txt' INTO TABLE T1; add jar /tmp/testudf.jar; create temporary function square as 'org.apache.hive.udf.UDFSquare'; select square(key) from T1 limit 3; {code} Fails with {code} Vertex failed, vertexName=Map 1, vertexId=vertex_1397230190905_0590_1_00, diagnostics=[Task failed, taskId=task_1397230190905_0590_1_00_00, diagnostics=[AttemptID:attempt_1397230190905_0590_1_00_00_0 Info:Error: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:145) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:163) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307) at org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557) at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553) Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hive.udf.UDFSquare at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.getUdfClass(GenericUDFBridge.java:133) at org.apache.hadoop.hive.ql.exec.FunctionRegistry.isStateful(FunctionRegistry.java:1636) at org.apache.hadoop.hive.ql.exec.FunctionRegistry.isDeterministic(FunctionRegistry.java:1599) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.isDeterministic(ExprNodeGenericFuncEvaluator.java:132) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.iterate(ExprNodeEvaluatorFactory.java:83) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.toCachedEval(ExprNodeEvaluatorFactory.java:73) at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:59) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:460) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:416) at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:189) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:425) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:121) ... 7 more {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-3595) Hive should adapt new FsShell commands since Hadoop 2 has changed FsShell argument structures
[ https://issues.apache.org/jira/browse/HIVE-3595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967172#comment-13967172 ] Ashutosh Chauhan commented on HIVE-3595: +1 Let's get this in, [~navis]. Needs a rebase. Hive should adapt new FsShell commands since Hadoop 2 has changed FsShell argument structures - Key: HIVE-3595 URL: https://issues.apache.org/jira/browse/HIVE-3595 Project: Hive Issue Type: Improvement Components: Shims Affects Versions: 0.9.0 Reporter: Harsh J Assignee: Navis Priority: Minor Attachments: HIVE-3595.1.patch.txt A simple example is that Hive calls -rmr in the FsShell class, which in Hadoop 2 is rm -r. Adapting helps avoid printing an unnecessary Deprecated warning in Hive when the Hadoop23 (or hadoop-2) shim is in use. We should wrap the logic and call the right hadoop-2 commands to avoid this. -- This message was sent by Atlassian JIRA (v6.2#6252)
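An illustrative shim sketch for the -rmr case described above (class and method names hypothetical, not the HIVE-3595 patch): Hadoop 1 FsShell understands "-rmr <path>", Hadoop 2 deprecates it in favour of "-rm -r <path>", so a version-specific shim keeps the caller agnostic.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FsShell;

public class FsShellCompat {
  /** Recursive delete using the argument form the running Hadoop version prefers. */
  public static int recursiveDelete(Configuration conf, String path, boolean hadoop2)
      throws Exception {
    String[] args = hadoop2
        ? new String[] {"-rm", "-r", path}   // Hadoop 2 style
        : new String[] {"-rmr", path};       // Hadoop 1 style (deprecated on Hadoop 2)
    return new FsShell(conf).run(args);
  }
}
{code}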
[jira] [Commented] (HIVE-5847) DatabaseMetadata.getColumns() doesn't show correct column size for char/varchar/decimal
[ https://issues.apache.org/jira/browse/HIVE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967181#comment-13967181 ] Hive QA commented on HIVE-5847: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12639852/HIVE-5847.1.patch {color:green}SUCCESS:{color} +1 5614 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build//testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build//console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12639852 DatabaseMetadata.getColumns() doesn't show correct column size for char/varchar/decimal --- Key: HIVE-5847 URL: https://issues.apache.org/jira/browse/HIVE-5847 Project: Hive Issue Type: Bug Components: JDBC Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-5847.1.patch, HIVE-5847.1.patch column_size, decimal_digits, num_prec_radix should be set appropriately based on the type qualifiers. -- This message was sent by Atlassian JIRA (v6.2#6252)
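The fields HIVE-5847 fixes are the standard JDBC metadata columns. For reference, this is where they surface on the client side; the JDBC URL and table name below are placeholders, and COLUMN_SIZE, DECIMAL_DIGITS and NUM_PREC_RADIX are the DatabaseMetaData.getColumns() result columns whose values depend on the type qualifiers (char/varchar length, decimal precision and scale).
{code}
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class ColumnMetadataDemo {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    try (Connection conn =
             DriverManager.getConnection("jdbc:hive2://localhost:10000/default")) {
      DatabaseMetaData meta = conn.getMetaData();
      try (ResultSet rs = meta.getColumns(null, "default", "my_table", "%")) {
        while (rs.next()) {
          System.out.printf("%s %s size=%d digits=%d radix=%d%n",
              rs.getString("COLUMN_NAME"), rs.getString("TYPE_NAME"),
              rs.getInt("COLUMN_SIZE"), rs.getInt("DECIMAL_DIGITS"),
              rs.getInt("NUM_PREC_RADIX"));
        }
      }
    }
  }
}
{code}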
[jira] [Commented] (HIVE-5269) Use thrift binary type for conveying binary values in hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967189#comment-13967189 ] Hive QA commented on HIVE-5269: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12609771/HIVE-5269.2.patch.txt Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2225/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2225/console Messages: {noformat} This message was trimmed, see log for full details Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/OutputCommitterContainer.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/ProgressReporter.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/StorerInfo.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/HCatSplit.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/DefaultOutputCommitterContainer.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/oozie/JavaAction.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/cli/SemanticAnalysis/HCatSemanticAnalyzerBase.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/cli/SemanticAnalysis/CreateDatabaseHook.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/cli/SemanticAnalysis/HCatSemanticAnalyzer.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/cli/SemanticAnalysis/CreateTableHook.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/cli/HCatCli.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/cli/HCatDriver.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/security/HdfsAuthorizationProvider.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/security/StorageDelegationAuthorizationProvider.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/common/HCatContext.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/common/ErrorType.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/common/HCatConstants.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/common/HCatUtil.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/common/HCatException.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/common/HiveClientCache.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/ReaderWriter.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/schema/HCatSchema.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/schema/HCatSchemaUtils.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/schema/HCatFieldSchema.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/HCatRecordSerDe.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/HCatRecordable.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/DefaultHCatRecord.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/transfer/state/DefaultStateProvider.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/transfer/state/StateProvider.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/transfer/impl/HCatOutputFormatWriter.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/transfer/impl/HCatInputFormatReader.java' Reverted 
'hcatalog/core/src/main/java/org/apache/hcatalog/data/transfer/WriterContext.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/transfer/HCatReader.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/transfer/DataTransferFactory.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/transfer/EntityBase.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/transfer/ReaderContext.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/transfer/WriteEntity.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/transfer/ReadEntity.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/transfer/HCatWriter.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/HCatRecordObjectInspectorFactory.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/DataType.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/Pair.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/HCatRecordObjectInspector.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/JsonSerDe.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/HCatRecord.java' Reverted 'hcatalog/core/src/main/java/org/apache/hcatalog/data/LazyHCatRecord.java' Reverted
[jira] [Commented] (HIVE-6432) Remove deprecated methods in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967186#comment-13967186 ] Hive QA commented on HIVE-6432: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12639868/HIVE-6432.3.patch Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2224/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2224/console Messages: {noformat} This message was trimmed, see log for full details [INFO] No sources to compile [INFO] [INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-testutils --- [INFO] [INFO] [INFO] Building Hive Packaging 0.14.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-packaging --- [INFO] [INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-packaging --- [INFO] Executing tasks main: [INFO] Executed tasks [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-packaging --- [INFO] Executing tasks main: [delete] Deleting directory /data/hive-ptest/working/apache-svn-trunk-source/packaging/target/tmp [delete] Deleting directory /data/hive-ptest/working/apache-svn-trunk-source/packaging/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/packaging/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/packaging/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/packaging/target/tmp/conf [copy] Copying 5 files to /data/hive-ptest/working/apache-svn-trunk-source/packaging/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] Reactor Summary: [INFO] [INFO] Hive .. SUCCESS [4.411s] [INFO] Hive Ant Utilities SUCCESS [3.790s] [INFO] Hive Shims Common . SUCCESS [3.082s] [INFO] Hive Shims 0.20 ... SUCCESS [2.231s] [INFO] Hive Shims Secure Common .. SUCCESS [2.274s] [INFO] Hive Shims 0.20S .. SUCCESS [1.764s] [INFO] Hive Shims 0.23 ... SUCCESS [5.358s] [INFO] Hive Shims SUCCESS [0.720s] [INFO] Hive Common ... SUCCESS [4.632s] [INFO] Hive Serde SUCCESS [1.604s] [INFO] Hive Metastore SUCCESS [6.245s] [INFO] Hive Query Language ... SUCCESS [23.350s] [INFO] Hive Service .. SUCCESS [3.951s] [INFO] Hive JDBC . SUCCESS [1.662s] [INFO] Hive Beeline .. SUCCESS [1.982s] [INFO] Hive CLI .. SUCCESS [1.371s] [INFO] Hive Contrib .. SUCCESS [1.964s] [INFO] Hive HBase Handler SUCCESS [2.993s] [INFO] Hive HCatalog . SUCCESS [0.318s] [INFO] Hive HCatalog Core SUCCESS [0.993s] [INFO] Hive HCatalog Pig Adapter . SUCCESS [2.272s] [INFO] Hive HCatalog Server Extensions ... SUCCESS [1.434s] [INFO] Hive HCatalog Webhcat Java Client . SUCCESS [1.632s] [INFO] Hive HCatalog Webhcat . SUCCESS [10.408s] [INFO] Hive HCatalog Streaming ... SUCCESS [1.733s] [INFO] Hive HWI .. SUCCESS [1.074s] [INFO] Hive ODBC . SUCCESS [0.958s] [INFO] Hive Shims Aggregator . SUCCESS [0.368s] [INFO] Hive TestUtils SUCCESS [0.411s] [INFO] Hive Packaging SUCCESS [1.178s] [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 1:40.924s [INFO] Finished at: Fri Apr 11 18:36:32 EDT 2014 [INFO] Final Memory: 52M/146M [INFO] + cd itests + mvn -B clean install -DskipTests -Dmaven.repo.local=/data/hive-ptest/working/maven -Phadoop-1 [INFO] Scanning for projects... [INFO]
[jira] [Commented] (HIVE-6726) Hcat cli does not close SessionState
[ https://issues.apache.org/jira/browse/HIVE-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967194#comment-13967194 ] Thejas M Nair commented on HIVE-6726: - +1 Hcat cli does not close SessionState Key: HIVE-6726 URL: https://issues.apache.org/jira/browse/HIVE-6726 Project: Hive Issue Type: Bug Affects Versions: 0.13.0, 0.14.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6726.patch When running HCat E2E tests, it was observed that the hcat cli left Tez sessions on the RM, which ultimately die only upon timeout. Expected behavior is to clean up the Tez sessions immediately upon exit. This causes slowness in system tests, as over time a lot of orphan Tez sessions hang around. Looking through the code, it seems obvious in retrospect: HCatCli starts a SessionState but does not explicitly call close on it, exiting the JVM through System.exit instead. This needs to be changed to explicitly call SessionState.close() before exiting. -- This message was sent by Atlassian JIRA (v6.2#6252)
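A simplified sketch of the shape of the fix described above (not the exact HCatCli diff): close the SessionState before exiting, instead of letting System.exit() tear the JVM down while a live Tez session is still registered on the RM.
{code}
import org.apache.hadoop.hive.cli.CliSessionState;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.ql.session.SessionState;

public class SessionCleanupSketch {
  public static void main(String[] args) throws Exception {
    SessionState ss = new CliSessionState(new HiveConf(SessionState.class));
    SessionState.start(ss);
    int exitCode = 0;
    try {
      // ... run hcat commands ...
    } finally {
      ss.close();          // releases the Tez session instead of leaving it to time out
    }
    System.exit(exitCode);
  }
}
{code}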
[jira] [Created] (HIVE-6899) Add an ability to specify the type of execution to use (async/sync execution) on JDBC client
Vaibhav Gumashta created HIVE-6899: -- Summary: Add an ability to specify the type of execution to use (async/sync execution) on JDBC client Key: HIVE-6899 URL: https://issues.apache.org/jira/browse/HIVE-6899 Project: Hive Issue Type: Improvement Components: JDBC Affects Versions: 0.14.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.14.0 I think it will be useful for testing purposes. Currently, to do any comparison (or to test any regression), the JDBC client needs to be recompiled. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6898) Functions in hive are failing with java.lang.ClassNotFoundException on Tez
[ https://issues.apache.org/jira/browse/HIVE-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967215#comment-13967215 ] Sergey Shelukhin commented on HIVE-6898: Couple of nits on rb, can be fixed on commit, +1 Functions in hive are failing with java.lang.ClassNotFoundException on Tez -- Key: HIVE-6898 URL: https://issues.apache.org/jira/browse/HIVE-6898 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.13.0, 0.14.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-6898.1.patch {code} CREATE TABLE T1(key int, val STRING) STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '../../data/files/T1.txt' INTO TABLE T1; add jar /tmp/testudf.jar; create temporary function square as 'org.apache.hive.udf.UDFSquare'; select square(key) from T1 limit 3; {code} Fails with {code} Vertex failed, vertexName=Map 1, vertexId=vertex_1397230190905_0590_1_00, diagnostics=[Task failed, taskId=task_1397230190905_0590_1_00_00, diagnostics=[AttemptID:attempt_1397230190905_0590_1_00_00_0 Info:Error: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:145) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:163) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307) at org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557) at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553) Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hive.udf.UDFSquare at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.getUdfClass(GenericUDFBridge.java:133) at org.apache.hadoop.hive.ql.exec.FunctionRegistry.isStateful(FunctionRegistry.java:1636) at org.apache.hadoop.hive.ql.exec.FunctionRegistry.isDeterministic(FunctionRegistry.java:1599) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.isDeterministic(ExprNodeGenericFuncEvaluator.java:132) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.iterate(ExprNodeEvaluatorFactory.java:83) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.toCachedEval(ExprNodeEvaluatorFactory.java:73) at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:59) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:460) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:416) at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:189) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:425) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:121) ... 7 more {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6888) Hive leaks MapWork objects via Utilities::gWorkMap
[ https://issues.apache.org/jira/browse/HIVE-6888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-6888: --- Fix Version/s: (was: 0.13.0) Hive leaks MapWork objects via Utilities::gWorkMap -- Key: HIVE-6888 URL: https://issues.apache.org/jira/browse/HIVE-6888 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Gunther Hagleitner Attachments: HIVE-6888.patch When running multiple queries with hive on a single Application Master, we found that hive leaks a large number of MapWork objects which accumulate in the AM -- This message was sent by Atlassian JIRA (v6.2#6252)
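A generic illustration of the leak pattern behind HIVE-6888 (class and method names hypothetical, not the Utilities source): plan objects cached in a static, per-JVM map are only reclaimed if some code path removes them. In a long-lived Application Master that runs many queries, entries that are never cleared accumulate until memory runs out.
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PlanCacheSketch {
  private static final Map<String, Object> WORK_MAP = new ConcurrentHashMap<>();

  static void cachePlan(String planPath, Object mapWork) {
    WORK_MAP.put(planPath, mapWork);     // grows across queries in the same JVM...
  }

  static void clearPlan(String planPath) {
    WORK_MAP.remove(planPath);           // ...unless every query path also cleans up
  }
}
{code}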
[jira] [Updated] (HIVE-6745) HCat MultiOutputFormat hardcodes DistributedCache keynames
[ https://issues.apache.org/jira/browse/HIVE-6745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6745: --- Affects Version/s: 0.14.0 0.13.0 Status: Patch Available (was: Open) Marking as patch-available. HCat MultiOutputFormat hardcodes DistributedCache keynames -- Key: HIVE-6745 URL: https://issues.apache.org/jira/browse/HIVE-6745 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.13.0, 0.14.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6745.patch There's a bug in how MultiOutputFormat deals with DistributedCache, in that it hardcodes the parameter name to merge for distributed cache entries in the jobconf. This parameter name has changed with recent builds of 2.x, thus causing a test failure. These parameters need to be properly shimmed out. -- This message was sent by Atlassian JIRA (v6.2#6252)
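A sketch of the shimming the HIVE-6745 description calls for (helper names hypothetical): the jobconf key that lists distributed-cache files is "mapred.cache.files" on older Hadoop and "mapreduce.job.cache.files" on newer 2.x builds, so the key should come from a shim rather than being hardcoded.
{code}
import org.apache.hadoop.conf.Configuration;

public class CacheKeyShim {
  static String cacheFilesKey(boolean newApi) {
    return newApi ? "mapreduce.job.cache.files" : "mapred.cache.files";
  }

  static void appendCacheFile(Configuration conf, boolean newApi, String fileUri) {
    String key = cacheFilesKey(newApi);
    String existing = conf.get(key);
    conf.set(key,
        existing == null || existing.isEmpty() ? fileUri : existing + "," + fileUri);
  }
}
{code}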
[jira] [Updated] (HIVE-6432) Remove deprecated methods in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6432: --- Attachment: HIVE-6432.4.patch Missed removing one more dependency in the qtest dir. Updated. Remove deprecated methods in HCatalog - Key: HIVE-6432 URL: https://issues.apache.org/jira/browse/HIVE-6432 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.14.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.14.0 Attachments: 6432-addendum.patch, 6432-full.patch, HIVE-6432.2.patch, HIVE-6432.3.patch, HIVE-6432.4.patch, HIVE-6432.patch, HIVE-6432.wip.1.patch, HIVE-6432.wip.2.patch, hcat.6432.test.out There are a lot of methods in HCatalog that have been deprecated in HCatalog 0.5, and some that were recently deprecated in Hive 0.11 (joint release with HCatalog). The goal for HCatalog deprecation is that in general, after something has been deprecated, it is expected to stay around for 2 releases, which means hive-0.13 will be the last release to ship with all the methods that were deprecated in hive-0.11 (the org.apache.hcatalog.* files should all be removed afterwards), and it is also good for us to clean out and nuke all other older deprecated methods. We should take this on early in a dev/release cycle to allow us time to resolve all fallout, so I propose that we remove all HCatalog deprecated methods after we branch out 0.13 and 0.14 becomes trunk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6432) Remove deprecated methods in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6432: --- Status: Open (was: Patch Available) Remove deprecated methods in HCatalog - Key: HIVE-6432 URL: https://issues.apache.org/jira/browse/HIVE-6432 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.14.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.14.0 Attachments: 6432-addendum.patch, 6432-full.patch, HIVE-6432.2.patch, HIVE-6432.3.patch, HIVE-6432.4.patch, HIVE-6432.patch, HIVE-6432.wip.1.patch, HIVE-6432.wip.2.patch, hcat.6432.test.out There are a lot of methods in HCatalog that have been deprecated in HCatalog 0.5, and some that were recently deprecated in Hive 0.11 (joint release with HCatalog). The goal for HCatalog deprecation is that in general, after something has been deprecated, it is expected to stay around for 2 releases, which means hive-0.13 will be the last release to ship with all the methods that were deprecated in hive-0.11 (the org.apache.hcatalog.* files should all be removed afterwards), and it is also good for us to clean out and nuke all other older deprecated methods. We should take this on early in a dev/release cycle to allow us time to resolve all fallout, so I propose that we remove all HCatalog deprecated methods after we branch out 0.13 and 0.14 becomes trunk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6432) Remove deprecated methods in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6432: --- Status: Patch Available (was: Open) Remove deprecated methods in HCatalog - Key: HIVE-6432 URL: https://issues.apache.org/jira/browse/HIVE-6432 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.14.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.14.0 Attachments: 6432-addendum.patch, 6432-full.patch, HIVE-6432.2.patch, HIVE-6432.3.patch, HIVE-6432.4.patch, HIVE-6432.patch, HIVE-6432.wip.1.patch, HIVE-6432.wip.2.patch, hcat.6432.test.out There are a lot of methods in HCatalog that have been deprecated in HCatalog 0.5, and some that were recently deprecated in Hive 0.11 (joint release with HCatalog). The goal for HCatalog deprecation is that in general, after something has been deprecated, it is expected to stay around for 2 releases, which means hive-0.13 will be the last release to ship with all the methods that were deprecated in hive-0.11 (the org.apache.hcatalog.* files should all be removed afterwards), and it is also good for us to clean out and nuke all other older deprecated methods. We should take this on early in a dev/release cycle to allow us time to resolve all fallout, so I propose that we remove all HCatalog deprecated methods after we branch out 0.13 and 0.14 becomes trunk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6745) HCat MultiOutputFormat hardcodes DistributedCache keynames
[ https://issues.apache.org/jira/browse/HIVE-6745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967233#comment-13967233 ] Ashutosh Chauhan commented on HIVE-6745: +1 HCat MultiOutputFormat hardcodes DistributedCache keynames -- Key: HIVE-6745 URL: https://issues.apache.org/jira/browse/HIVE-6745 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.13.0, 0.14.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6745.patch There's a bug in how MultiOutputFormat deals with DistributedCache, in that it hardcodes the parameter name to merge for distributed cache entries in the jobconf. This parameter name has changed with recent builds of 2.x, thus causing a test failure. These parameters need to be properly shimmed out. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6888) Hive leaks MapWork objects via Utilities::gWorkMap
[ https://issues.apache.org/jira/browse/HIVE-6888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-6888: --- Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) committed to trunk. Would be nice to get into 13 if there's another RC Hive leaks MapWork objects via Utilities::gWorkMap -- Key: HIVE-6888 URL: https://issues.apache.org/jira/browse/HIVE-6888 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Gunther Hagleitner Fix For: 0.14.0 Attachments: HIVE-6888.patch When running multiple queries with hive on a single Application Master, we found that hive leaks a large number of MapWork objects which accumulate in the AM -- This message was sent by Atlassian JIRA (v6.2#6252)
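As a hedged sketch of the leak pattern being described, a static map in a long-lived Application Master that is never cleaned, with hypothetical class and method names rather than Hive's actual Utilities code:

{code}
// Hypothetical names; a sketch of the leak pattern, not Hive's Utilities class.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public final class WorkCache {
  // A static map in a long-lived process (such as a reused Application Master)
  // grows by one entry per query unless something removes the entries.
  private static final Map<String, Object> WORK_MAP = new ConcurrentHashMap<String, Object>();

  public static void put(String planPath, Object work) {
    WORK_MAP.put(planPath, work);
  }

  public static Object get(String planPath) {
    return WORK_MAP.get(planPath);
  }

  // The cleanup pattern: drop a query's entries when it finishes, so repeated
  // queries on one AM do not accumulate MapWork objects without bound.
  public static void clearFor(String planPathPrefix) {
    for (String key : WORK_MAP.keySet()) {
      if (key.startsWith(planPathPrefix)) {
        WORK_MAP.remove(key);
      }
    }
  }
}
{code}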
Re: Review Request 20243: HIVE-6891 - Alter rename partition Perm inheritance and general partition/table owner inheritance
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/20243/ --- (Updated April 11, 2014, 11:41 p.m.) Review request for hive. Changes --- Mkdirs will change only the group as discussed, as hive is not a superuser. Tested on a cluster, verified it works as expected if hive is a member of that group. I will try to refactor the unit test with MiniDFS to verify the group part in a follow-up JIRA, as I cannot create other groups while it's running on a real local file system. Bugs: HIVE-6891 https://issues.apache.org/jira/browse/HIVE-6891 Repository: hive-git Description --- This is a follow-up of HIVE-6648. Extending the fix to other partition/table operations as well, by refactoring the fixed code in HIVE-6648 into a common FileUtils helper method, and then using it for all table/partition directory creation operations, when the hive.warehouse.subdir.inherit.perms flag is set. Another part of this change is to add ownership inheritance in this code as well when creating directories. Ownership was already inherited for data (HIVE-3756), but not at the table/partitioned-table level. Diffs (updated) - common/src/java/org/apache/hadoop/hive/common/FileUtils.java ad82f62 itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/TestFolderPermissions.java f1c7b7b metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 8345d70 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java c62e085 metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java f731dab ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 2559e0e Diff: https://reviews.apache.org/r/20243/diff/ Testing --- Extending the unit test TestFolderPermissions to handle all the new cases of directory creation (create table, external table, static partition, dynamic partition, rename partition). Unfortunately, because the test uses the local file system, I cannot add the ownership inheritance to the unit tests. I can probably look into using MiniDFS for that in a follow-up JIRA. Thanks, Szehon Ho
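A minimal sketch of the kind of FileUtils-style helper described above, assuming standard Hadoop FileSystem calls. It is not the actual HIVE-6891 change, but it shows the parent-directory permission and group copy (group only, since hive is not a superuser):

{code}
// Hedged sketch of the helper idea, not the actual FileUtils change in HIVE-6891.
// Only the group is copied (setOwner with a null user), matching the
// "hive is not a superuser" constraint mentioned above.
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class InheritPermsSketch {

  // Create dir and, when the inherit flag is set, copy the parent directory's
  // permissions and group onto the new directory.
  public static boolean mkdirInheriting(FileSystem fs, Path dir, boolean inheritPerms)
      throws IOException {
    boolean created = fs.mkdirs(dir);
    if (created && inheritPerms) {
      FileStatus parent = fs.getFileStatus(dir.getParent());
      fs.setPermission(dir, parent.getPermission());
      fs.setOwner(dir, null, parent.getGroup());  // null user: change only the group
    }
    return created;
  }
}
{code}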
[jira] [Updated] (HIVE-6891) Alter rename partition Perm inheritance and general partition/table owner inheritance
[ https://issues.apache.org/jira/browse/HIVE-6891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-6891: Attachment: HIVE-6891.2.patch Addressing review comments. Alter rename partition Perm inheritance and general partition/table owner inheritance - Key: HIVE-6891 URL: https://issues.apache.org/jira/browse/HIVE-6891 Project: Hive Issue Type: Bug Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-6891.2.patch, HIVE-6891.patch Found this issue while looking at the method mentioned by HIVE-6648. Doing 'alter table .. partition .. rename to ..' does not respect permission inheritance. Also, in these scenarios of directory creation, ownership is not being inherited. Data files are already inheriting owner by HIVE-3756. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6891) Alter rename partition Perm inheritance and general partition/table owner inheritance
[ https://issues.apache.org/jira/browse/HIVE-6891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-6891: Status: Patch Available (was: Open) Alter rename partition Perm inheritance and general partition/table owner inheritance - Key: HIVE-6891 URL: https://issues.apache.org/jira/browse/HIVE-6891 Project: Hive Issue Type: Bug Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-6891.2.patch, HIVE-6891.patch Found this issue while looking at the method mentioned by HIVE-6648. Doing 'alter table .. partition .. rename to ..' does not respect permission inheritance. Also, in these scenarios of directory creation, ownership is not being inherited. Data files are already inheriting owner by HIVE-3756. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6891) Alter rename partition Perm inheritance and general partition/table group inheritance
[ https://issues.apache.org/jira/browse/HIVE-6891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-6891: Summary: Alter rename partition Perm inheritance and general partition/table group inheritance (was: Alter rename partition Perm inheritance and general partition/table owner inheritance) Alter rename partition Perm inheritance and general partition/table group inheritance - Key: HIVE-6891 URL: https://issues.apache.org/jira/browse/HIVE-6891 Project: Hive Issue Type: Bug Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-6891.2.patch, HIVE-6891.patch Found this issue while looking at the method mentioned by HIVE-6648. Doing 'alter table .. partition .. rename to ..' does not respect permission inheritance. Also, in these scenarios of directory creation, ownership is not being inherited. Data files are already inheriting owner by HIVE-3756. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6891) Alter rename partition Perm inheritance and general partition/table group inheritance
[ https://issues.apache.org/jira/browse/HIVE-6891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-6891: Description: Found this issue while looking at the method mentioned by HIVE-6648. 'alter table .. partition .. rename to ..' and other commands calling Warehouse.mkdirs() doesn't inherit permission on the partition directories and consequently the data, when hive.warehouse.subdir.inherit.perms is set. Also, in these scenarios of directory creation, group is not being inherited. Data files are already inheriting group by HIVE-3756. was: Found this issue while looking at the method mentioned by HIVE-6648. Doing 'alter table .. partition .. rename to ..' does not respect permission inheritance. Also, in these scenarios of directory creation, ownership is not being inherited. Data files are already inheriting owner by HIVE-3756. Alter rename partition Perm inheritance and general partition/table group inheritance - Key: HIVE-6891 URL: https://issues.apache.org/jira/browse/HIVE-6891 Project: Hive Issue Type: Bug Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-6891.2.patch, HIVE-6891.patch Found this issue while looking at the method mentioned by HIVE-6648. 'alter table .. partition .. rename to ..' and other commands calling Warehouse.mkdirs() doesn't inherit permission on the partition directories and consequently the data, when hive.warehouse.subdir.inherit.perms is set. Also, in these scenarios of directory creation, group is not being inherited. Data files are already inheriting group by HIVE-3756. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6898) Functions in hive are failing with java.lang.ClassNotFoundException on Tez
[ https://issues.apache.org/jira/browse/HIVE-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-6898: - Attachment: HIVE-6898.2.patch Address Sergey's comments. Functions in hive are failing with java.lang.ClassNotFoundException on Tez -- Key: HIVE-6898 URL: https://issues.apache.org/jira/browse/HIVE-6898 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.13.0, 0.14.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-6898.1.patch, HIVE-6898.2.patch {code} CREATE TABLE T1(key int, val STRING) STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '../../data/files/T1.txt' INTO TABLE T1; add jar /tmp/testudf.jar; create temporary function square as 'org.apache.hive.udf.UDFSquare'; select square(key) from T1 limit 3; {code} Fails with {code} Vertex failed, vertexName=Map 1, vertexId=vertex_1397230190905_0590_1_00, diagnostics=[Task failed, taskId=task_1397230190905_0590_1_00_00, diagnostics=[AttemptID:attempt_1397230190905_0590_1_00_00_0 Info:Error: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:145) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:163) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307) at org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557) at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553) Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hive.udf.UDFSquare at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.getUdfClass(GenericUDFBridge.java:133) at org.apache.hadoop.hive.ql.exec.FunctionRegistry.isStateful(FunctionRegistry.java:1636) at org.apache.hadoop.hive.ql.exec.FunctionRegistry.isDeterministic(FunctionRegistry.java:1599) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.isDeterministic(ExprNodeGenericFuncEvaluator.java:132) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.iterate(ExprNodeEvaluatorFactory.java:83) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.toCachedEval(ExprNodeEvaluatorFactory.java:73) at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:59) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:460) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:416) at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:189) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:425) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:121) ... 7 more {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
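A generic, hedged illustration of why the ClassNotFoundException appears: the UDF class is only resolvable if the added jar is actually visible to the classloader in use when the operator tree initializes. This is plain JDK classloading, not the HIVE-6898 fix; the jar path and class name come from the reproduction steps above.

{code}
// Generic JDK classloading illustration, not the HIVE-6898 fix. The jar path and
// the UDF class name come from the reproduction steps above.
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

public class UdfVisibilityDemo {
  public static void main(String[] args) throws Exception {
    // Without the jar on the classloader in use, resolving the class fails with
    // exactly the ClassNotFoundException seen in the Tez task log above.
    URL jar = new File("/tmp/testudf.jar").toURI().toURL();
    URLClassLoader loader =
        new URLClassLoader(new URL[] { jar }, Thread.currentThread().getContextClassLoader());
    Class<?> udf = Class.forName("org.apache.hive.udf.UDFSquare", true, loader);
    System.out.println("Resolved " + udf.getName() + " via " + udf.getClassLoader());
  }
}
{code}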
[jira] [Commented] (HIVE-6785) query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe
[ https://issues.apache.org/jira/browse/HIVE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967259#comment-13967259 ] Szehon Ho commented on HIVE-6785: - Looked at the new patch, looks fine to me query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe -- Key: HIVE-6785 URL: https://issues.apache.org/jira/browse/HIVE-6785 Project: Hive Issue Type: Bug Components: File Formats, Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Tongjie Chen Fix For: 0.14.0 Attachments: HIVE-6785.1.patch.txt, HIVE-6785.2.patch.txt, HIVE-6785.3.patch When a hive table's SerDe is ParquetHiveSerDe, while some partitions use another SerDe, AND the table has string column[s], hive generates a confusing error message: Failed with exception java.io.IOException:java.lang.ClassCastException: parquet.hive.serde.primitive.ParquetStringInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.primitive.SettableTimestampObjectInspector This is confusing because timestamp is mentioned even though it is not used by the table. The reason is that when the table and partition SerDes differ, hive tries to convert between the object inspectors of the two SerDes. ParquetHiveSerDe's object inspector for the string type is ParquetStringInspector (newly introduced), which is a subclass of neither WritableStringObjectInspector nor JavaStringObjectInspector, the types ObjectInspectorConverters expects for string-category object inspectors. There is no break statement in the STRING case, hence the following TIMESTAMP case is executed, generating the confusing error message. see also the following parquet issue: https://github.com/Parquet/parquet-mr/issues/324 The fix is relatively easy: just make ParquetStringInspector a subclass of JavaStringObjectInspector instead of AbstractPrimitiveJavaObjectInspector. But because the constructor of JavaStringObjectInspector is package scoped instead of public or protected, we would need to move ParquetStringInspector to the same package as JavaStringObjectInspector. Also, ArrayWritableObjectInspector's setStructFieldData needs to accept List data, since the corresponding setStructFieldData and create methods return a list. This is also needed when the table SerDe is ParquetHiveSerDe and the partition SerDe is something else. -- This message was sent by Atlassian JIRA (v6.2#6252)
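A condensed, self-contained illustration of the fall-through being described, using abstracted stand-ins rather than the actual ObjectInspectorConverters source:

{code}
// Abstracted stand-in for the bug pattern, not the real ObjectInspectorConverters code.
public class FallThroughDemo {
  enum Category { STRING, TIMESTAMP }

  static String pickConverter(Category category, boolean outputIsKnownStringOI) {
    switch (category) {
      case STRING:
        if (outputIsKnownStringOI) {
          return "StringConverter";
        }
        // ParquetStringInspector matches neither expected string inspector, so
        // execution reaches this point, and with no break it falls through to
        // TIMESTAMP, where the SettableTimestampObjectInspector cast blows up.
      case TIMESTAMP:
        return "TimestampConverter (wrong branch for a string column)";
      default:
        throw new IllegalArgumentException("unsupported: " + category);
    }
  }

  public static void main(String[] args) {
    System.out.println(pickConverter(Category.STRING, false));
  }
}
{code}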
[jira] [Commented] (HIVE-6447) Bucket map joins in hive-tez
[ https://issues.apache.org/jira/browse/HIVE-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967266#comment-13967266 ] Vikram Dixit K commented on HIVE-6447: -- Hi Chris, I completely cleaned my .m2 and built this and it passes. I think the tez folks fixed this spelling issue in 0.4.0-incubating-SNAPSHOT and beyond. Hive is currently dependent on 0.4.0-incubating only. This works with that version of tez. Thanks Vikram. Bucket map joins in hive-tez Key: HIVE-6447 URL: https://issues.apache.org/jira/browse/HIVE-6447 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.13.0, 0.14.0 Attachments: HIVE-6447.1.patch, HIVE-6447.10.patch, HIVE-6447.11.patch, HIVE-6447.12.patch, HIVE-6447.13.patch, HIVE-6447.2.patch, HIVE-6447.3.patch, HIVE-6447.4.patch, HIVE-6447.5.patch, HIVE-6447.6.patch, HIVE-6447.7.patch, HIVE-6447.8.patch, HIVE-6447.9.patch, HIVE-6447.WIP.patch Support bucket map joins in tez. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6784) parquet-hive should allow column type change
[ https://issues.apache.org/jira/browse/HIVE-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tongjie Chen updated HIVE-6784: --- Status: Open (was: Patch Available) parquet-hive should allow column type change Key: HIVE-6784 URL: https://issues.apache.org/jira/browse/HIVE-6784 Project: Hive Issue Type: Bug Components: File Formats, Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Tongjie Chen Fix For: 0.14.0 Attachments: HIVE-6784.1.patch.txt, HIVE-6784.2.patch.txt see also in the following parquet issue: https://github.com/Parquet/parquet-mr/issues/323 Currently, if we change parquet format hive table using alter table parquet_table change c1 c1 bigint ( assuming original type of c1 is int), it will result in exception thrown from SerDe: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.LongWritable in query runtime. This is different behavior from hive (using other file format), where it will try to perform cast (null value in case of incompatible type). Parquet Hive's RecordReader returns an ArrayWritable (based on schema stored in footers of parquet files); ParquetHiveSerDe also creates an corresponding ArrayWritableObjectInspector (but using column type info from metastore). Whenever there is column type change, the objector inspector will throw exception, since WritableLongObjectInspector cannot inspect an IntWritable etc... Conversion has to happen somewhere if we want to allow type change. SerDe's deserialize method seems a natural place for it. Currently, serialize method calls createStruct (then createPrimitive) for every record, but it creates a new object regardless, which seems expensive. I think that could be optimized a bit by just returning the object passed if already of the right type. deserialize also reuse this method, if there is a type change, there will be new object to be created, which I think is inevitable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6900) HostUtil.getTaskLogUrl signature change causes compilation to fail
Chris Drome created HIVE-6900: - Summary: HostUtil.getTaskLogUrl signature change causes compilation to fail Key: HIVE-6900 URL: https://issues.apache.org/jira/browse/HIVE-6900 Project: Hive Issue Type: Bug Components: Shims Affects Versions: 0.13.0, 0.14.0 Reporter: Chris Drome The signature for HostUtil.getTaskLogUrl has changed between Hadoop-2.3 and Hadoop-2.4. Code in shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java works with Hadoop-2.3 method and causes compilation failure with Hadoop-2.4. -- This message was sent by Atlassian JIRA (v6.2#6252)
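One generic way a shim layer can cope with such a signature change is reflection. The sketch below is not the actual Hadoop23Shims fix, and the two parameter lists it tries (with and without a leading scheme argument) are assumptions for illustration.

{code}
// Sketch of a reflection-based way to cope with the signature change; not the
// actual Hadoop23Shims fix. The two parameter lists tried below are assumptions.
import java.lang.reflect.Method;

public final class TaskLogUrlShim {

  // Try the newer four-argument signature first, then fall back to the older
  // three-argument one, so a single shim works against either Hadoop release.
  public static String getTaskLogUrl(Class<?> hostUtilClass, String scheme,
      String host, String port, String attemptId) throws Exception {
    try {
      Method m = hostUtilClass.getMethod("getTaskLogUrl",
          String.class, String.class, String.class, String.class);
      return (String) m.invoke(null, scheme, host, port, attemptId);
    } catch (NoSuchMethodException e) {
      Method m = hostUtilClass.getMethod("getTaskLogUrl",
          String.class, String.class, String.class);
      return (String) m.invoke(null, host, port, attemptId);
    }
  }
}
{code}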
[jira] [Commented] (HIVE-6447) Bucket map joins in hive-tez
[ https://issues.apache.org/jira/browse/HIVE-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967287#comment-13967287 ] Chris Drome commented on HIVE-6447: --- Thanks for the clarification. We are using a newer version of Tez, which caused the problem. Bucket map joins in hive-tez Key: HIVE-6447 URL: https://issues.apache.org/jira/browse/HIVE-6447 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.13.0, 0.14.0 Attachments: HIVE-6447.1.patch, HIVE-6447.10.patch, HIVE-6447.11.patch, HIVE-6447.12.patch, HIVE-6447.13.patch, HIVE-6447.2.patch, HIVE-6447.3.patch, HIVE-6447.4.patch, HIVE-6447.5.patch, HIVE-6447.6.patch, HIVE-6447.7.patch, HIVE-6447.8.patch, HIVE-6447.9.patch, HIVE-6447.WIP.patch Support bucket map joins in tez. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HIVE-6784) parquet-hive should allow column type change
[ https://issues.apache.org/jira/browse/HIVE-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tongjie Chen resolved HIVE-6784. Resolution: Won't Fix parquet-hive should allow column type change Key: HIVE-6784 URL: https://issues.apache.org/jira/browse/HIVE-6784 Project: Hive Issue Type: Bug Components: File Formats, Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Tongjie Chen Fix For: 0.14.0 Attachments: HIVE-6784.1.patch.txt, HIVE-6784.2.patch.txt see also in the following parquet issue: https://github.com/Parquet/parquet-mr/issues/323 Currently, if we change parquet format hive table using alter table parquet_table change c1 c1 bigint ( assuming original type of c1 is int), it will result in exception thrown from SerDe: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.LongWritable in query runtime. This is different behavior from hive (using other file format), where it will try to perform cast (null value in case of incompatible type). Parquet Hive's RecordReader returns an ArrayWritable (based on schema stored in footers of parquet files); ParquetHiveSerDe also creates an corresponding ArrayWritableObjectInspector (but using column type info from metastore). Whenever there is column type change, the objector inspector will throw exception, since WritableLongObjectInspector cannot inspect an IntWritable etc... Conversion has to happen somewhere if we want to allow type change. SerDe's deserialize method seems a natural place for it. Currently, serialize method calls createStruct (then createPrimitive) for every record, but it creates a new object regardless, which seems expensive. I think that could be optimized a bit by just returning the object passed if already of the right type. deserialize also reuse this method, if there is a type change, there will be new object to be created, which I think is inevitable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6784) parquet-hive should allow column type change
[ https://issues.apache.org/jira/browse/HIVE-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967292#comment-13967292 ] Tongjie Chen commented on HIVE-6784: OK, I will cancel this patch. The exception raised by a type change actually only happens for non-partitioned tables. For partitioned tables, if there is a type change at the table level, there will be an ObjectInspectorConverter (in parquet's case, a StructConverter) to convert types between the partition and the table. For non-partitioned tables, the ObjectInspectorConverter is always IdentityConverter, which passes the deserialized object through as is, causing a type mismatch between the object and the ObjectInspector. For now, we can live with the workaround (insert overwrite table), given that this only affects non-partitioned tables and is relatively rare. parquet-hive should allow column type change Key: HIVE-6784 URL: https://issues.apache.org/jira/browse/HIVE-6784 Project: Hive Issue Type: Bug Components: File Formats, Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Tongjie Chen Fix For: 0.14.0 Attachments: HIVE-6784.1.patch.txt, HIVE-6784.2.patch.txt see also in the following parquet issue: https://github.com/Parquet/parquet-mr/issues/323 Currently, if we change parquet format hive table using alter table parquet_table change c1 c1 bigint ( assuming original type of c1 is int), it will result in exception thrown from SerDe: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.LongWritable in query runtime. This is different behavior from hive (using other file format), where it will try to perform cast (null value in case of incompatible type). Parquet Hive's RecordReader returns an ArrayWritable (based on schema stored in footers of parquet files); ParquetHiveSerDe also creates an corresponding ArrayWritableObjectInspector (but using column type info from metastore). Whenever there is column type change, the objector inspector will throw exception, since WritableLongObjectInspector cannot inspect an IntWritable etc... Conversion has to happen somewhere if we want to allow type change. SerDe's deserialize method seems a natural place for it. Currently, serialize method calls createStruct (then createPrimitive) for every record, but it creates a new object regardless, which seems expensive. I think that could be optimized a bit by just returning the object passed if already of the right type. deserialize also reuse this method, if there is a type change, there will be new object to be created, which I think is inevitable. -- This message was sent by Atlassian JIRA (v6.2#6252)
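For context on the conversion path mentioned above, a small hedged example using the standard serde2 converter APIs. It shows an explicit int-to-bigint converter producing a value the bigint inspector can read, which is what the partitioned-table path gets via StructConverter and what the non-partitioned IdentityConverter path never does:

{code}
// Hedged example using standard serde2 converter APIs; not the retracted patch.
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.Converter;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.IntWritable;

public class IntToBigintConversionDemo {
  public static void main(String[] args) {
    // Explicit conversion from an int object inspector to a bigint object inspector.
    Converter c = ObjectInspectorConverters.getConverter(
        PrimitiveObjectInspectorFactory.writableIntObjectInspector,
        PrimitiveObjectInspectorFactory.writableLongObjectInspector);
    // The converter produces an object the bigint inspector can read, instead of
    // handing an IntWritable directly to a WritableLongObjectInspector.
    Object converted = c.convert(new IntWritable(42));
    System.out.println(converted.getClass().getName() + " = " + converted);
  }
}
{code}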
[jira] [Commented] (HIVE-6898) Functions in hive are failing with java.lang.ClassNotFoundException on Tez
[ https://issues.apache.org/jira/browse/HIVE-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967295#comment-13967295 ] Hive QA commented on HIVE-6898: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12639877/HIVE-6898.1.patch {color:green}SUCCESS:{color} +1 5614 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2226/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2226/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12639877 Functions in hive are failing with java.lang.ClassNotFoundException on Tez -- Key: HIVE-6898 URL: https://issues.apache.org/jira/browse/HIVE-6898 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.13.0, 0.14.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-6898.1.patch, HIVE-6898.2.patch {code} CREATE TABLE T1(key int, val STRING) STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '../../data/files/T1.txt' INTO TABLE T1; add jar /tmp/testudf.jar; create temporary function square as 'org.apache.hive.udf.UDFSquare'; select square(key) from T1 limit 3; {code} Fails with {code} Vertex failed, vertexName=Map 1, vertexId=vertex_1397230190905_0590_1_00, diagnostics=[Task failed, taskId=task_1397230190905_0590_1_00_00, diagnostics=[AttemptID:attempt_1397230190905_0590_1_00_00_0 Info:Error: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:145) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:163) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307) at org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557) at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553) Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hive.udf.UDFSquare at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.getUdfClass(GenericUDFBridge.java:133) at org.apache.hadoop.hive.ql.exec.FunctionRegistry.isStateful(FunctionRegistry.java:1636) at org.apache.hadoop.hive.ql.exec.FunctionRegistry.isDeterministic(FunctionRegistry.java:1599) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.isDeterministic(ExprNodeGenericFuncEvaluator.java:132) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.iterate(ExprNodeEvaluatorFactory.java:83) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.toCachedEval(ExprNodeEvaluatorFactory.java:73) at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:59) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:460) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:416) at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:189) at 
org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:425) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:121) ... 7 more {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6883) Dynamic partitioning optimization does not honor sort order or order by
[ https://issues.apache.org/jira/browse/HIVE-6883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-6883: --- Resolution: Fixed Fix Version/s: (was: 0.13.0) 0.14.0 Status: Resolved (was: Patch Available) committed to trunk Dynamic partitioning optimization does not honor sort order or order by --- Key: HIVE-6883 URL: https://issues.apache.org/jira/browse/HIVE-6883 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Priority: Critical Fix For: 0.14.0 Attachments: HIVE-6883.1.patch, HIVE-6883.2.patch, HIVE-6883.3.patch HIVE-6455 patch does not honor sort order of the output table or order by of select statement. The reason for the former is numDistributionKey in ReduceSinkDesc is set wrongly. It doesn't take into account the sort columns, because of this RSOp sets the sort columns to null in Key. Since nulls are set in place of sort columns in Key, the sort columns in Value are not sorted. The other issue is ORDER BY columns are not honored during insertion. For example {code} insert overwrite table over1k_part_orc partition(ds=foo, t) select si,i,b,f,t from over1k_orc where t is null or t=27 order by si; {code} the select query performs order by on column 'si' in the first MR job. The following MR job (inserted by HIVE-6455), sorts the input data on dynamic partition column 't' without taking into account the already sorted 'si' column. This results in out of order insertion for 'si' column. -- This message was sent by Atlassian JIRA (v6.2#6252)
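A conceptual sketch in plain Java, not Hive's ReduceSinkDesc handling, of the composite sort key the description calls for: the re-sort for dynamic partitions has to key on the partition column 't' followed by the already-ordered column 'si', otherwise the ORDER BY is lost within each partition.

{code}
// Conceptual sketch only; the column names mirror the example query above.
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class DynPartSortDemo {

  static class Row {
    final int si;   // column the query ordered by
    final int t;    // dynamic partition column
    Row(int si, int t) { this.si = si; this.t = t; }
    public String toString() { return "(si=" + si + ", t=" + t + ")"; }
  }

  public static void main(String[] args) {
    List<Row> rows = new ArrayList<Row>();
    rows.add(new Row(3, 27));
    rows.add(new Row(1, 27));
    rows.add(new Row(2, 5));

    // Sort key = (t, si): rows stay grouped by partition AND ordered by si within it.
    // Sorting on t alone, as the description says HIVE-6455 did, loses the si order.
    Collections.sort(rows, new Comparator<Row>() {
      public int compare(Row a, Row b) {
        if (a.t != b.t) {
          return Integer.compare(a.t, b.t);
        }
        return Integer.compare(a.si, b.si);
      }
    });
    System.out.println(rows);
  }
}
{code}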
[jira] [Commented] (HIVE-5336) HCatSchema.remove(HCatFieldSchema hcatFieldSchema) should renumber the fieldPositionMap and the fieldPositionMap should not be cached by the end user
[ https://issues.apache.org/jira/browse/HIVE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967317#comment-13967317 ] Sushanth Sowmyan commented on HIVE-5336: Attached. Also, I realize pre-commit tests already ran on the patch before, but I'd like to see a more recent run before I can commit. HCatSchema.remove(HCatFieldSchema hcatFieldSchema) should renumber the fieldPositionMap and the fieldPositionMap should not be cached by the end user -- Key: HIVE-5336 URL: https://issues.apache.org/jira/browse/HIVE-5336 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-5336.1.patch.txt, HIVE-5336.2.patch.txt, HIVE-5336.3.patch HCatSchema.remove currently does not renumber the fieldPositionMap which can be a problem when there are interleaving append() and remove() calls. 1. We should document that fieldPositionMap should not be cached by the end-user 2. We should make sure that the fieldPositionMap gets renumbered after remove() because HcatSchema.get will otherwise return wrong FieldSchemas. -- This message was sent by Atlassian JIRA (v6.2#6252)
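A simplified, hedged illustration of the renumbering requirement, not the actual HCatSchema implementation: after removing a field, every later field's position must shift down by one, or lookups return the wrong field schema.

{code}
// Simplified illustration of the renumbering requirement, not the actual HCatSchema code.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PositionMapDemo {
  private final List<String> fields = new ArrayList<String>();
  private final Map<String, Integer> fieldPositionMap = new HashMap<String, Integer>();

  public void append(String name) {
    fields.add(name);
    fieldPositionMap.put(name, fields.size() - 1);
  }

  public void remove(String name) {
    Integer pos = fieldPositionMap.remove(name);
    if (pos == null) {
      return;
    }
    fields.remove((int) pos);
    // The step HIVE-5336 asks for: renumber everything after the removed field,
    // so interleaved append()/remove() calls keep positions consistent.
    for (int i = pos; i < fields.size(); i++) {
      fieldPositionMap.put(fields.get(i), i);
    }
  }

  public Integer getPosition(String name) {
    return fieldPositionMap.get(name);
  }
}
{code}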
[jira] [Commented] (HIVE-5336) HCatSchema.remove(HCatFieldSchema hcatFieldSchema) should renumber the fieldPositionMap and the fieldPositionMap should not be cached by the end user
[ https://issues.apache.org/jira/browse/HIVE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967316#comment-13967316 ] Sushanth Sowmyan commented on HIVE-5336: Patch looks good to me, +1 on intent. It does not apply on current trunk, however, so I rebased it so we could run our precommit tests on it. Attaching that patch. HCatSchema.remove(HCatFieldSchema hcatFieldSchema) should renumber the fieldPositionMap and the fieldPositionMap should not be cached by the end user -- Key: HIVE-5336 URL: https://issues.apache.org/jira/browse/HIVE-5336 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-5336.1.patch.txt, HIVE-5336.2.patch.txt, HIVE-5336.3.patch HCatSchema.remove currently does not renumber the fieldPositionMap which can be a problem when there are interleaving append() and remove() calls. 1. We should document that fieldPositionMap should not be cached by the end-user 2. We should make sure that the fieldPositionMap gets renumbered after remove() because HcatSchema.get will otherwise return wrong FieldSchemas. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5336) HCatSchema.remove(HCatFieldSchema hcatFieldSchema) should renumber the fieldPositionMap and the fieldPositionMap should not be cached by the end user
[ https://issues.apache.org/jira/browse/HIVE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-5336: --- Status: Patch Available (was: Open) HCatSchema.remove(HCatFieldSchema hcatFieldSchema) should renumber the fieldPositionMap and the fieldPositionMap should not be cached by the end user -- Key: HIVE-5336 URL: https://issues.apache.org/jira/browse/HIVE-5336 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-5336.1.patch.txt, HIVE-5336.2.patch.txt, HIVE-5336.3.patch HCatSchema.remove currently does not renumber the fieldPositionMap which can be a problem when there are interleaving append() and remove() calls. 1. We should document that fieldPositionMap should not be cached by the end-user 2. We should make sure that the fieldPositionMap gets renumbered after remove() because HcatSchema.get will otherwise return wrong FieldSchemas. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5336) HCatSchema.remove(HCatFieldSchema hcatFieldSchema) should renumber the fieldPositionMap and the fieldPositionMap should not be cached by the end user
[ https://issues.apache.org/jira/browse/HIVE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-5336: --- Status: Open (was: Patch Available) HCatSchema.remove(HCatFieldSchema hcatFieldSchema) should renumber the fieldPositionMap and the fieldPositionMap should not be cached by the end user -- Key: HIVE-5336 URL: https://issues.apache.org/jira/browse/HIVE-5336 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-5336.1.patch.txt, HIVE-5336.2.patch.txt, HIVE-5336.3.patch HCatSchema.remove currently does not renumber the fieldPositionMap which can be a problem when there are interleaving append() and remove() calls. 1. We should document that fieldPositionMap should not be cached by the end-user 2. We should make sure that the fieldPositionMap gets renumbered after remove() because HcatSchema.get will otherwise return wrong FieldSchemas. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5336) HCatSchema.remove(HCatFieldSchema hcatFieldSchema) should renumber the fieldPositionMap and the fieldPositionMap should not be cached by the end user
[ https://issues.apache.org/jira/browse/HIVE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-5336: --- Attachment: HIVE-5336.3.patch HCatSchema.remove(HCatFieldSchema hcatFieldSchema) should renumber the fieldPositionMap and the fieldPositionMap should not be cached by the end user -- Key: HIVE-5336 URL: https://issues.apache.org/jira/browse/HIVE-5336 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-5336.1.patch.txt, HIVE-5336.2.patch.txt, HIVE-5336.3.patch HCatSchema.remove currently does not renumber the fieldPositionMap which can be a problem when there are interleaving append() and remove() calls. 1. We should document that fieldPositionMap should not be cached by the end-user 2. We should make sure that the fieldPositionMap gets renumbered after remove() because HcatSchema.get will otherwise return wrong FieldSchemas. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6035) Windows: percentComplete returned by job status from WebHCat is null
[ https://issues.apache.org/jira/browse/HIVE-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967324#comment-13967324 ] Sushanth Sowmyan commented on HIVE-6035: I'm going to cancel this patch, upload a new version of the same patch (so it applies cleanly on trunk) and re-mark it as patch available. If precommit tests do not have a problem, I'll go ahead and commit. Windows: percentComplete returned by job status from WebHCat is null Key: HIVE-6035 URL: https://issues.apache.org/jira/browse/HIVE-6035 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: shanyu zhao Assignee: shanyu zhao Attachments: HIVE-6035.1.patch, HIVE-6035.2.patch, HIVE-6035.3.patch, HIVE-6035.patch HIVE-5511 fixed the same problem on Linux, but it still broke on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6035) Windows: percentComplete returned by job status from WebHCat is null
[ https://issues.apache.org/jira/browse/HIVE-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6035: --- Attachment: HIVE-6035.3.patch Attached new patch. Also, please note that I have removed the fix version for this patch, since fix version is supposed to be set only after a commit has been made, and tracks what version the patch has been applied to, not which version the commit is desired in. Windows: percentComplete returned by job status from WebHCat is null Key: HIVE-6035 URL: https://issues.apache.org/jira/browse/HIVE-6035 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: shanyu zhao Assignee: shanyu zhao Attachments: HIVE-6035.1.patch, HIVE-6035.2.patch, HIVE-6035.3.patch, HIVE-6035.patch HIVE-5511 fixed the same problem on Linux, but it still broke on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6035) Windows: percentComplete returned by job status from WebHCat is null
[ https://issues.apache.org/jira/browse/HIVE-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6035: --- Fix Version/s: (was: 0.13.0) Status: Open (was: Patch Available) Windows: percentComplete returned by job status from WebHCat is null Key: HIVE-6035 URL: https://issues.apache.org/jira/browse/HIVE-6035 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: shanyu zhao Assignee: shanyu zhao Attachments: HIVE-6035.1.patch, HIVE-6035.2.patch, HIVE-6035.3.patch, HIVE-6035.patch HIVE-5511 fixed the same problem on Linux, but it still broke on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6035) Windows: percentComplete returned by job status from WebHCat is null
[ https://issues.apache.org/jira/browse/HIVE-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6035: --- Status: Patch Available (was: Open) Windows: percentComplete returned by job status from WebHCat is null Key: HIVE-6035 URL: https://issues.apache.org/jira/browse/HIVE-6035 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: shanyu zhao Assignee: shanyu zhao Attachments: HIVE-6035.1.patch, HIVE-6035.2.patch, HIVE-6035.3.patch, HIVE-6035.patch HIVE-5511 fixed the same problem on Linux, but it still broke on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6480) Metastore server startup script ignores ENV settings
[ https://issues.apache.org/jira/browse/HIVE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967333#comment-13967333 ] Sushanth Sowmyan commented on HIVE-6480: +1, Looks good to me, will commit. Metastore server startup script ignores ENV settings Key: HIVE-6480 URL: https://issues.apache.org/jira/browse/HIVE-6480 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Adam Faris Priority: Minor Attachments: HIVE-6480.01.patch This is a minor issue with hcat_server.sh. Currently the startup script has HADOOP_HEAPSIZE hardcoded to 2048, causing administrators to hand edit the script. As hcat_server.sh reads hcat-env.sh, it makes sense to allow an administrator to define HADOOP_HEAPSIZE in hcat-env.sh (or other location like /etc/profile). Secondly, there is no defined default for METASTORE_PORT in hcat_server.sh. If METASTORE_PORT is missing, the metastore server fails to start. I will attach a patch in my next update, once this jira is opened. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6480) Metastore server startup script ignores ENV settings
[ https://issues.apache.org/jira/browse/HIVE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967334#comment-13967334 ] Sushanth Sowmyan commented on HIVE-6480: Committed. Thanks for the contribution, Adam! Metastore server startup script ignores ENV settings Key: HIVE-6480 URL: https://issues.apache.org/jira/browse/HIVE-6480 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Adam Faris Priority: Minor Fix For: 0.14.0 Attachments: HIVE-6480.01.patch This is a minor issue with hcat_server.sh. Currently the startup script has HADOOP_HEAPSIZE hardcoded to 2048, causing administrators to hand edit the script. As hcat_server.sh reads hcat-env.sh, it makes sense to allow an administrator to define HADOOP_HEAPSIZE in hcat-env.sh (or other location like /etc/profile). Secondly, there is no defined default for METASTORE_PORT in hcat_server.sh. If METASTORE_PORT is missing, the metastore server fails to start. I will attach a patch in my next update, once this jira is opened. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6480) Metastore server startup script ignores ENV settings
[ https://issues.apache.org/jira/browse/HIVE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6480: --- Fix Version/s: 0.14.0 Metastore server startup script ignores ENV settings Key: HIVE-6480 URL: https://issues.apache.org/jira/browse/HIVE-6480 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Adam Faris Priority: Minor Fix For: 0.14.0 Attachments: HIVE-6480.01.patch This is a minor issue with hcat_server.sh. Currently the startup script has HADOOP_HEAPSIZE hardcoded to 2048, causing administrators to hand edit the script. As hcat_server.sh reads hcat-env.sh, it makes sense to allow an administrator to define HADOOP_HEAPSIZE in hcat-env.sh (or other location like /etc/profile). Secondly, there is no defined default for METASTORE_PORT in hcat_server.sh. If METASTORE_PORT is missing, the metastore server fails to start. I will attach a patch in my next update, once this jira is opened. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6480) Metastore server startup script ignores ENV settings
[ https://issues.apache.org/jira/browse/HIVE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6480: Assignee: Adam Faris Metastore server startup script ignores ENV settings Key: HIVE-6480 URL: https://issues.apache.org/jira/browse/HIVE-6480 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Adam Faris Assignee: Adam Faris Priority: Minor Fix For: 0.14.0 Attachments: HIVE-6480.01.patch This is a minor issue with hcat_server.sh. Currently the startup script has HADOOP_HEAPSIZE hardcoded to 2048, causing administrators to hand edit the script. As hcat_server.sh reads hcat-env.sh, it makes sense to allow an administrator to define HADOOP_HEAPSIZE in hcat-env.sh (or other location like /etc/profile). Secondly, there is no defined default for METASTORE_PORT in hcat_server.sh. If METASTORE_PORT is missing, the metastore server fails to start. I will attach a patch in my next update, once this jira is opened. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6480) Metastore server startup script ignores ENV settings
[ https://issues.apache.org/jira/browse/HIVE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6480: --- Resolution: Fixed Status: Resolved (was: Patch Available) Metastore server startup script ignores ENV settings Key: HIVE-6480 URL: https://issues.apache.org/jira/browse/HIVE-6480 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Adam Faris Assignee: Adam Faris Priority: Minor Fix For: 0.14.0 Attachments: HIVE-6480.01.patch This is a minor issue with hcat_server.sh. Currently the startup script has HADOOP_HEAPSIZE hardcoded to 2048, causing administrators to hand edit the script. As hcat_server.sh reads hcat-env.sh, it makes sense to allow an administrator to define HADOOP_HEAPSIZE in hcat-env.sh (or other location like /etc/profile). Secondly, there is no defined default for METASTORE_PORT in hcat_server.sh. If METASTORE_PORT is missing, the metastore server fails to start. I will attach a patch in my next update, once this jira is opened. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6901) Explain plan doesn't show operator tree for the fetch operator
[ https://issues.apache.org/jira/browse/HIVE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-6901: -- Description: Explaining a simple select query that involves a MR phase doesn't show processor tree for the fetch operator. {code} hive explain select d from test; OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: ... Stage: Stage-0 Fetch Operator limit: -1 {code} It would be nice if the operator tree is shown even if there is only one node. Please note that in local execution, the operator tree is complete: {code} hive explain select * from test; OK STAGE DEPENDENCIES: Stage-0 is a root stage STAGE PLANS: Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: TableScan alias: test Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: d (type: int) outputColumnNames: _col0 Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE ListSink {code} was: Explaining a simple select query that involves a MR phase doesn't show complete processor tree for the fetch operator. {code} hive explain select d from test; OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: ... Stage: Stage-0 Fetch Operator limit: -1 {code} It would be nice if the operator tree is shown even if there is only one node. Please note that in local execution, the operator tree is complete: {code} hive explain select * from test; OK STAGE DEPENDENCIES: Stage-0 is a root stage STAGE PLANS: Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: TableScan alias: test Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: d (type: int) outputColumnNames: _col0 Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE ListSink {code} Explain plan doesn't show operator tree for the fetch operator -- Key: HIVE-6901 URL: https://issues.apache.org/jira/browse/HIVE-6901 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Explaining a simple select query that involves a MR phase doesn't show processor tree for the fetch operator. {code} hive explain select d from test; OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: ... Stage: Stage-0 Fetch Operator limit: -1 {code} It would be nice if the operator tree is shown even if there is only one node. Please note that in local execution, the operator tree is complete: {code} hive explain select * from test; OK STAGE DEPENDENCIES: Stage-0 is a root stage STAGE PLANS: Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: TableScan alias: test Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: d (type: int) outputColumnNames: _col0 Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE ListSink {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6901) Explain plan doesn't show operator tree for the fetch operator
Xuefu Zhang created HIVE-6901: - Summary: Explain plan doesn't show operator tree for the fetch operator Key: HIVE-6901 URL: https://issues.apache.org/jira/browse/HIVE-6901 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Explaining a simple select query that involves a MR phase doesn't show complete processor tree for the fetch operator. {code} hive explain select d from test; OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: ... Stage: Stage-0 Fetch Operator limit: -1 {code} It would be nice if the operator tree is shown even if there is only one node. Please note that in local execution, the operator tree is complete: {code} hive explain select * from test; OK STAGE DEPENDENCIES: Stage-0 is a root stage STAGE PLANS: Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: TableScan alias: test Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: d (type: int) outputColumnNames: _col0 Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE ListSink {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6745) HCat MultiOutputFormat hardcodes DistributedCache keynames
[ https://issues.apache.org/jira/browse/HIVE-6745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967352#comment-13967352 ] Hive QA commented on HIVE-6745: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12636762/HIVE-6745.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5614 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_dyn_part {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2227/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2227/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12636762 HCat MultiOutputFormat hardcodes DistributedCache keynames -- Key: HIVE-6745 URL: https://issues.apache.org/jira/browse/HIVE-6745 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.13.0, 0.14.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6745.patch There's a bug in how MultiOutputFormat deals with DistributedCache, in that it hardcodes the parameter name to merge for distributed cache entries in the jobconf. This parameter name has changed with recent builds of 2.x, thus causing a test failure. These parameters need to be properly shimmed out. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6432) Remove deprecated methods in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967358#comment-13967358 ] Hive QA commented on HIVE-6432:
---
{color:red}Overall{color}: -1 no tests executed
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639883/HIVE-6432.4.patch
Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2228/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2228/console
Messages:
{noformat}
This message was trimmed, see log for full details
[INFO]
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-hcatalog-it-unit ---
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hive-hcatalog-it-unit ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/src/main/resources
[INFO] Copying 3 resources
[INFO]
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-hcatalog-it-unit ---
[INFO] Executing tasks
main:
[INFO] Executed tasks
[INFO]
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-hcatalog-it-unit ---
[INFO] No sources to compile
[INFO]
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ hive-hcatalog-it-unit ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/src/test/resources
[INFO] Copying 3 resources
[INFO]
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-hcatalog-it-unit ---
[INFO] Executing tasks
main:
[mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/target/tmp
[mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/target/warehouse
[mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/target/tmp/conf
[copy] Copying 5 files to /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/target/tmp/conf
[INFO] Executed tasks
[INFO]
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hive-hcatalog-it-unit ---
[INFO] Compiling 4 source files to /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/target/test-classes
[INFO] -
[WARNING] COMPILATION WARNING :
[INFO] -
[WARNING] Note: /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/mapreduce/TestSequenceFileReadWrite.java uses or overrides a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
[INFO] 2 warnings
[INFO] -
[INFO] -
[ERROR] COMPILATION ERROR :
[INFO] -
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/hbase/TestPigHBaseStorageHandler.java:[59,49] cannot find symbol
  symbol: class SkeletonHBaseTest
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/hbase/TestPigHBaseStorageHandler.java:[71,33] cannot find symbol
  symbol : method getClass()
  location: class org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/hbase/TestPigHBaseStorageHandler.java:[71,16] internal error; cannot instantiate org.apache.hadoop.hive.conf.HiveConf.init at org.apache.hadoop.hive.conf.HiveConf to ()
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/hbase/TestPigHBaseStorageHandler.java:[74,17] cannot find symbol
  symbol : method getFileSystem()
  location: class org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/hbase/TestPigHBaseStorageHandler.java:[76,9] cannot find symbol
  symbol : method getTestDir()
  location: class org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/hbase/TestPigHBaseStorageHandler.java:[83,41] cannot find symbol
  symbol : method getHbaseConf()
  location: class org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
[ERROR]
[jira] [Updated] (HIVE-6901) Explain plan doesn't show operator tree for the fetch operator
[ https://issues.apache.org/jira/browse/HIVE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-6901:
--
Status: Patch Available (was: Open)
Explain plan doesn't show operator tree for the fetch operator
--
Key: HIVE-6901 URL: https://issues.apache.org/jira/browse/HIVE-6901 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Attachments: HIVE-6901.patch
Explaining a simple select query that involves an MR phase doesn't show the processor tree for the fetch operator.
{code}
hive> explain select d from test;
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 is a root stage
STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
        ...
  Stage: Stage-0
    Fetch Operator
      limit: -1
{code}
It would be nice if the operator tree were shown even if there is only one node. Please note that in local execution, the operator tree is complete:
{code}
hive> explain select * from test;
OK
STAGE DEPENDENCIES:
  Stage-0 is a root stage
STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: test
          Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE
          Select Operator
            expressions: d (type: int)
            outputColumnNames: _col0
            Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE
            ListSink
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6901) Explain plan doesn't show operator tree for the fetch operator
[ https://issues.apache.org/jira/browse/HIVE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-6901:
--
Attachment: HIVE-6901.patch
With the patch attached, the following will be shown instead:
{code}
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
{code}
Expecting test output diffs.
-- This message was sent by Atlassian JIRA (v6.2#6252)
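For readers unfamiliar with how the plan text above is produced: it is essentially an indented pre-order walk over the operator DAG, so once the fetch task exposes its root operator (the ListSink shown above), the same kind of traversal can print it. The sketch below only assumes Operator.getChildOperators() from the real API; the printer class itself is illustrative and is not the code in the attached patch.
{code}
// Illustrative sketch: indented pre-order print of a Hive operator tree,
// the same traversal shape EXPLAIN output follows. Only getChildOperators()
// is assumed from the real Operator API; this class is not the actual patch.
import java.io.PrintStream;
import java.util.List;

import org.apache.hadoop.hive.ql.exec.Operator;
import org.apache.hadoop.hive.ql.plan.OperatorDesc;

class OperatorTreePrinter {
  static void print(Operator<? extends OperatorDesc> op, PrintStream out, int depth) {
    if (op == null) {
      return; // the task has no operator tree to show
    }
    for (int i = 0; i < depth; i++) {
      out.print("  "); // two spaces per level, mirroring the explain layout
    }
    out.println(op.getClass().getSimpleName());
    List<Operator<? extends OperatorDesc>> children = op.getChildOperators();
    if (children != null) {
      for (Operator<? extends OperatorDesc> child : children) {
        print(child, out, depth + 1);
      }
    }
  }
}
{code}
A fetch task whose tree contains only the sink would then print just ListSink, which is consistent with the output shown for the patch.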
[jira] [Commented] (HIVE-6895) Beeline -f Requires A New Line
[ https://issues.apache.org/jira/browse/HIVE-6895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967361#comment-13967361 ] Xuefu Zhang commented on HIVE-6895:
---
It seems that the problem doesn't exist in trunk. I assume it has already been fixed. Would you mind verifying, [~eljefe6a]?
Beeline -f Requires A New Line
--
Key: HIVE-6895 URL: https://issues.apache.org/jira/browse/HIVE-6895 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.10.0 Reporter: Jesse Anderson
When a file is run through Beeline with -f, the file must end with a newline or the final command will not run; the Beeline shell seems to disconnect without executing it. Here is the command without a trailing newline:
{noformat}
[training@dev solution]$ beeline -u jdbc:hive2://dev.loudacre.com -n training -p training -f create-devicestatus-table.hql
scan complete in 7ms
Connecting to jdbc:hive2://dev.loudacre.com
Connected to: Hive (version 0.10.0)
Driver: Hive (version 0.10.0-cdh4.5.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 0.10.0-cdh4.5.0 by Apache Hive
0: jdbc:hive2://dev.loudacre.com> DROP TABLE IF EXISTS devicestatus;Closing: org.apache.hive.jdbc.HiveConnection
{noformat}
Here is the same command with a trailing newline:
{noformat}
[training@dev solution]$ beeline -u jdbc:hive2://dev.loudacre.com -n training -p training -f create-devicestatus-table.hql
scan complete in 8ms
Connecting to jdbc:hive2://dev.loudacre.com
Connected to: Hive (version 0.10.0)
Driver: Hive (version 0.10.0-cdh4.5.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 0.10.0-cdh4.5.0 by Apache Hive
0: jdbc:hive2://dev.loudacre.com> DROP TABLE IF EXISTS devicestatus;
No rows affected (0.222 seconds)
0: jdbc:hive2://dev.loudacre.com> Closing: org.apache.hive.jdbc.HiveConnection
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
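The behavior described above is the classic trailing-newline pitfall: if a script reader only dispatches a statement when it sees a line terminator, a final statement that is not followed by a newline is silently dropped. The following is a minimal sketch of a reader that does not lose the last line; it is illustrative only and not Beeline's actual implementation.
{code}
// Illustrative sketch: read a SQL script so the final statement runs even when
// the file lacks a trailing newline. BufferedReader.readLine() still returns the
// last unterminated line, so the key is to flush whatever remains after the loop.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

class ScriptReaderExample {
  static void run(String path) throws IOException {
    StringBuilder current = new StringBuilder();
    try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
      String line;
      while ((line = reader.readLine()) != null) { // last line is returned even without '\n'
        current.append(line).append('\n');
        if (line.trim().endsWith(";")) {
          execute(current.toString());
          current.setLength(0);
        }
      }
    }
    if (current.toString().trim().length() > 0) {
      execute(current.toString()); // flush an unterminated trailing statement
    }
  }

  private static void execute(String sql) {
    System.out.println("executing: " + sql.trim()); // stand-in for submitting to HiveServer2
  }
}
{code}
Until a fix is confirmed on the affected 0.10.0 build, simply ensuring the script file ends with a newline avoids the symptom, as the second session above shows.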
[jira] [Commented] (HIVE-6891) Alter rename partition Perm inheritance and general partition/table group inheritance
[ https://issues.apache.org/jira/browse/HIVE-6891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967362#comment-13967362 ] Hive QA commented on HIVE-6891:
---
{color:red}Overall{color}: -1 no tests executed
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639888/HIVE-6891.2.patch
Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2229/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2229/console
Messages:
{noformat}
This message was trimmed, see log for full details
[INFO] Tests are skipped.
[INFO]
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-hcatalog-it-unit ---
[INFO] Building jar: /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/target/hive-hcatalog-it-unit-0.14.0-SNAPSHOT.jar
[INFO]
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hive-hcatalog-it-unit ---
[INFO]
[INFO] --- maven-jar-plugin:2.2:test-jar (default) @ hive-hcatalog-it-unit ---
[INFO] Building jar: /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/target/hive-hcatalog-it-unit-0.14.0-SNAPSHOT-tests.jar
[INFO]
[INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-hcatalog-it-unit ---
[INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/target/hive-hcatalog-it-unit-0.14.0-SNAPSHOT.jar to /data/hive-ptest/working/maven/org/apache/hive/hive-hcatalog-it-unit/0.14.0-SNAPSHOT/hive-hcatalog-it-unit-0.14.0-SNAPSHOT.jar
[INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/hive-hcatalog-it-unit/0.14.0-SNAPSHOT/hive-hcatalog-it-unit-0.14.0-SNAPSHOT.pom
[INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/itests/hcatalog-unit/target/hive-hcatalog-it-unit-0.14.0-SNAPSHOT-tests.jar to /data/hive-ptest/working/maven/org/apache/hive/hive-hcatalog-it-unit/0.14.0-SNAPSHOT/hive-hcatalog-it-unit-0.14.0-SNAPSHOT-tests.jar
[INFO]
[INFO]
[INFO] Building Hive Integration - Testing Utilities 0.14.0-SNAPSHOT
[INFO]
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-it-util ---
[INFO] Deleting /data/hive-ptest/working/apache-svn-trunk-source/itests/util (includes = [datanucleus.log, derby.log], excludes = [])
[INFO]
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-it-util ---
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hive-it-util ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-svn-trunk-source/itests/util/src/main/resources
[INFO] Copying 3 resources
[INFO]
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-it-util ---
[INFO] Executing tasks
main:
[INFO] Executed tasks
[INFO]
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-it-util ---
[INFO] Compiling 45 source files to /data/hive-ptest/working/apache-svn-trunk-source/itests/util/target/classes
[WARNING] Note: Some input files use or override a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
[INFO]
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ hive-it-util ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-svn-trunk-source/itests/util/src/test/resources
[INFO] Copying 3 resources
[INFO]
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-it-util ---
[INFO] Executing tasks
main:
[mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/itests/util/target/tmp
[mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/itests/util/target/warehouse
[mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/itests/util/target/tmp/conf
[copy] Copying 5 files to /data/hive-ptest/working/apache-svn-trunk-source/itests/util/target/tmp/conf
[INFO] Executed tasks
[INFO]
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hive-it-util ---
[INFO] No sources to compile
[INFO]
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-it-util ---
[INFO] Tests are skipped.
[INFO]
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-it-util ---
[INFO] Building jar: /data/hive-ptest/working/apache-svn-trunk-source/itests/util/target/hive-it-util-0.14.0-SNAPSHOT.jar
[INFO]
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hive-it-util ---
[INFO]
[INFO] ---