[jira] [Commented] (HIVE-13365) Allow multiple llap instances with the MiniLlap cluster

2016-03-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215448#comment-15215448
 ] 

Hive QA commented on HIVE-13365:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12795530/HIVE-13365.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 788 failed/errored test(s), 6999 tests 
executed
*Failed tests:*
{noformat}
TestCliDriver-acid_vectorization.q-smb_mapjoin_2.q-exim_02_00_part_empty.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-alter_char2.q-ppd_join3.q-vectorization_14.q-and-12-more - did 
not produce a TEST-*.xml file
TestCliDriver-alter_table_not_sorted.q-authorization_update.q-dynamic_partition_pruning.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-alter_view_rename.q-tez_bmj_schema_evolution.q-llap_uncompressed.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-authorization_1_sql_std.q-drop_index_removes_partition_dirs.q-udf_date_sub.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-authorization_cli_nonsql.q-cbo_rp_subq_in.q-rcfile_merge1.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-authorization_cli_stdconfigauth.q-vectorized_parquet.q-ba_table_union.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-authorization_create_table_owner_privs.q-create_func1.q-partition_wise_fileformat.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-authorization_role_grant2.q-bucketcontext_3.q-windowing_multipartitioning.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-authorization_view_1.q-vector_left_outer_join2.q-add_jar_pfile.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-authorization_view_disable_cbo_3.q-vector_groupby_3.q-decimal_udf.q-and-2-more
 - did not produce a TEST-*.xml file
TestCliDriver-authorization_view_disable_cbo_4.q-vectorization_13.q-udf_explode.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-auto_join18_multi_distinct.q-interval_udf.q-list_bucket_query_multiskew_2.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-auto_join30.q-unionall_unbalancedppd.q-lock1.q-and-12-more - did 
not produce a TEST-*.xml file
TestCliDriver-auto_join4.q-mapjoin_decimal.q-input_dynamicserde.q-and-12-more - 
did not produce a TEST-*.xml file
TestCliDriver-auto_join9.q-insert_into_with_schema.q-schema_evol_text_nonvec_mapwork_table.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-auto_sortmerge_join_13.q-tez_self_join.q-exim_03_nonpart_over_compat.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-autogen_colalias.q-compute_stats_date.q-union29.q-and-12-more - 
did not produce a TEST-*.xml file
TestCliDriver-avro_add_column.q-orc_wide_table.q-query_with_semi.q-and-12-more 
- did not produce a TEST-*.xml file
TestCliDriver-avro_decimal_native.q-alter_file_format.q-groupby3_map_skew.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-avro_joins.q-disallow_incompatible_type_change_off.q-udf_max.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-bool_literal.q-udf_hash.q-groupby4_map.q-and-12-more - did not 
produce a TEST-*.xml file
TestCliDriver-bucket_map_join_tez1.q-ppd_random.q-vector_auto_smb_mapjoin_14.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-bucketmapjoin3.q-udf_round_3.q-udf_between.q-and-12-more - did 
not produce a TEST-*.xml file
TestCliDriver-cbo_rp_join1.q-enforce_order.q-bucketcontext_4.q-and-12-more - 
did not produce a TEST-*.xml file
TestCliDriver-cbo_rp_limit.q-show_columns.q-input31.q-and-12-more - did not 
produce a TEST-*.xml file
TestCliDriver-cbo_rp_stats.q-skewjoinopt16.q-rename_column.q-and-12-more - did 
not produce a TEST-*.xml file
TestCliDriver-cbo_rp_udf_udaf.q-groupby4_noskew.q-list_bucket_dml_11.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-cbo_rp_views.q-cbo_rp_semijoin.q-offset_limit_ppd_optimizer.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-columnStatsUpdateForStatsOptimizer_2.q-alter_partition_clusterby_sortby.q-udf_repeat.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-columnstats_partlvl_dp.q-smb_mapjoin_21.q-udf_sha1.q-and-12-more 
- did not produce a TEST-*.xml file
TestCliDriver-columnstats_tbllvl.q-index_compact.q-input14.q-and-12-more - did 
not produce a TEST-*.xml file
TestCliDriver-compute_stats_string.q-load_dyn_part12.q-nullgroup4_multi_distinct.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-cp_mj_rc.q-masking_disablecbo_1.q-udf_stddev_pop.q-and-12-more - 
did not produce a TEST-*.xml file
TestCliDriver-create_genericudf.q-ambiguitycheck.q-join13.q-and-12-more - did 
not produce a TEST-*.xml file
TestCliDriver-create_or_replace_view.q-join_cond_pushdown_3.q-struct_in_view.q-and-12-more
 - did not produce a 

[jira] [Commented] (HIVE-10365) First job fails with StackOverflowError [Spark Branch]

2016-03-28 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215435#comment-15215435
 ] 

Szehon Ho commented on HIVE-10365:
--

I happened to see this in a spark-executor during a query as well.  Just 
leaving a note in case someone else hits this.

The solution is to set spark.executor.extraJavaOptions to a sufficiently high 
-Xss value.
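
For reference, a minimal sketch of that workaround as a Hive-on-Spark session
setting (the 16m stack size is an illustrative assumption, not a
recommendation):

{code}
-- Hedged example: the exact -Xss value needed is workload-dependent
set spark.executor.extraJavaOptions=-Xss16m;
{code}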

> First job fails with StackOverflowError [Spark Branch]
> --
>
> Key: HIVE-10365
> URL: https://issues.apache.org/jira/browse/HIVE-10365
> Project: Hive
>  Issue Type: Bug
>Affects Versions: spark-branch
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>
> When running some queries on Yarn with standalone Hadoop, the first query 
> fails with StackOverflowError:
> {noformat}
> java.lang.StackOverflowError
>   at 
> java.util.concurrent.ConcurrentHashMap.hash(ConcurrentHashMap.java:333)
>   at 
> java.util.concurrent.ConcurrentHashMap.putIfAbsent(ConcurrentHashMap.java:1145)
>   at java.lang.ClassLoader.getClassLoadingLock(ClassLoader.java:464)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:405)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13372) Hive Macro overwritten when multiple macros are used in one column

2016-03-28 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215369#comment-15215369
 ] 

Pengcheng Xiong commented on HIVE-13372:


[~big60], could you apply the patch and try? cc'ing [~ashutoshc], could you 
please take a look? Thanks.

> Hive Macro overwritten when multiple macros are used in one column
> --
>
> Key: HIVE-13372
> URL: https://issues.apache.org/jira/browse/HIVE-13372
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Lin Liu
>Assignee: Pengcheng Xiong
>Priority: Critical
> Attachments: HIVE-13372.01.patch
>
>
> When multiple macros are used in one column, results of the later ones are 
> overwritten by that of the first.
> For example:
> Suppose we have created a table called macro_test with single column x in 
> STRING type, and with data as:
> "a"
> "bb"
> "ccc"
> We also create three macros:
> {code}
> CREATE TEMPORARY MACRO STRING_LEN(x string) length(x);
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1;
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2;
> {code}
> When we ran the following query, 
> {code}
> SELECT
> CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", 
> STRING_LEN_PLUS_TWO(x)) a
> FROM macro_test
> SORT BY a DESC;
> {code}
> We get result:
> 3:3:3
> 2:2:2
> 1:1:1
> instead of expected:
> 3:4:5
> 2:3:4
> 1:2:3
> Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and 
> HIVE-12277 patches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13372) Hive Macro overwritten when multiple macros are used in one column

2016-03-28 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13372:
---
Status: Patch Available  (was: Open)

> Hive Macro overwritten when multiple macros are used in one column
> --
>
> Key: HIVE-13372
> URL: https://issues.apache.org/jira/browse/HIVE-13372
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Lin Liu
>Assignee: Pengcheng Xiong
>Priority: Critical
> Attachments: HIVE-13372.01.patch
>
>
> When multiple macros are used in one column, results of the later ones are 
> overwritten by that of the first.
> For example:
> Suppose we have created a table called macro_test with single column x in 
> STRING type, and with data as:
> "a"
> "bb"
> "ccc"
> We also create three macros:
> {code}
> CREATE TEMPORARY MACRO STRING_LEN(x string) length(x);
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1;
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2;
> {code}
> When we ran the following query, 
> {code}
> SELECT
> CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", 
> STRING_LEN_PLUS_TWO(x)) a
> FROM macro_test
> SORT BY a DESC;
> {code}
> We get result:
> 3:3:3
> 2:2:2
> 1:1:1
> instead of expected:
> 3:4:5
> 2:3:4
> 1:2:3
> Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and 
> HIVE-12277 patches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13372) Hive Macro overwritten when multiple macros are used in one column

2016-03-28 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13372:
---
Attachment: HIVE-13372.01.patch

> Hive Macro overwritten when multiple macros are used in one column
> --
>
> Key: HIVE-13372
> URL: https://issues.apache.org/jira/browse/HIVE-13372
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Lin Liu
>Assignee: Pengcheng Xiong
>Priority: Critical
> Attachments: HIVE-13372.01.patch
>
>
> When multiple macros are used in one column, results of the later ones are 
> overwritten by that of the first.
> For example:
> Suppose we have created a table called macro_test with single column x in 
> STRING type, and with data as:
> "a"
> "bb"
> "ccc"
> We also create three macros:
> {code}
> CREATE TEMPORARY MACRO STRING_LEN(x string) length(x);
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1;
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2;
> {code}
> When we ran the following query, 
> {code}
> SELECT
> CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", 
> STRING_LEN_PLUS_TWO(x)) a
> FROM macro_test
> SORT BY a DESC;
> {code}
> We get result:
> 3:3:3
> 2:2:2
> 1:1:1
> instead of expected:
> 3:4:5
> 2:3:4
> 1:2:3
> Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and 
> HIVE-12277 patches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13372) Hive Macro overwritten when multiple macros are used in one column

2016-03-28 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13372:
---
Attachment: HIVE-13372.01.patch

> Hive Macro overwritten when multiple macros are used in one column
> --
>
> Key: HIVE-13372
> URL: https://issues.apache.org/jira/browse/HIVE-13372
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Lin Liu
>Assignee: Pengcheng Xiong
>Priority: Critical
>
> When multiple macros are used in one column, results of the later ones are 
> overwritten by that of the first.
> For example:
> Suppose we have created a table called macro_test with single column x in 
> STRING type, and with data as:
> "a"
> "bb"
> "ccc"
> We also create three macros:
> {code}
> CREATE TEMPORARY MACRO STRING_LEN(x string) length(x);
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1;
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2;
> {code}
> When we ran the following query, 
> {code}
> SELECT
> CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", 
> STRING_LEN_PLUS_TWO(x)) a
> FROM macro_test
> SORT BY a DESC;
> {code}
> We get result:
> 3:3:3
> 2:2:2
> 1:1:1
> instead of expected:
> 3:4:5
> 2:3:4
> 1:2:3
> Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and 
> HIVE-12277 patches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13373) Use most specific type for numerical constants

2016-03-28 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13373:

Status: Patch Available  (was: Open)

> Use most specific type for numerical constants
> --
>
> Key: HIVE-13373
> URL: https://issues.apache.org/jira/browse/HIVE-13373
> Project: Hive
>  Issue Type: Bug
>  Components: Types
>Affects Versions: 2.0.0, 1.1.0, 1.2.0, 1.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13373.patch
>
>
> tinyint & smallint are currently inferred as int if they are written without 
> a postfix.
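
For context, a minimal HiveQL sketch of the numeric literal postfixes involved
(the Y/S/L postfix syntax is standard Hive; the no-postfix line shows the
behavior the issue complains about):

{code}
SELECT 100Y;  -- tinyint literal (Y postfix)
SELECT 100S;  -- smallint literal (S postfix)
SELECT 100L;  -- bigint literal (L postfix)
SELECT 100;   -- no postfix: inferred as int today, even though 100 fits in tinyint
{code}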



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13373) Use most specific type for numerical constants

2016-03-28 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13373:

Attachment: HIVE-13373.patch

> Use most specific type for numerical constants
> --
>
> Key: HIVE-13373
> URL: https://issues.apache.org/jira/browse/HIVE-13373
> Project: Hive
>  Issue Type: Bug
>  Components: Types
>Affects Versions: 1.0.0, 1.2.0, 1.1.0, 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13373.patch
>
>
> tinyint & smallint are currently inferred as int if they are written without 
> a postfix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13268) Add a HA mini cluster type in MiniHS2

2016-03-28 Thread Takanobu Asanuma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HIVE-13268:

Attachment: HIVE-13268.3.patch

[~sershe]
Thanks for the review.
TestPigHBaseStorageHandler passed on my local machine. I have uploaded a new 
patch and would like to see the new Jenkins results. It includes a new unit 
test class cloned from TestJdbcWithMiniMR.

> Add a HA mini cluster type in MiniHS2
> -
>
> Key: HIVE-13268
> URL: https://issues.apache.org/jira/browse/HIVE-13268
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Minor
> Attachments: HIVE-13268.1.patch, HIVE-13268.2.patch, 
> HIVE-13268.3.patch
>
>
> We need an HA mini cluster for unit tests. This jira is for implementing that 
> in MiniHS2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13130) HS2 changes : API calls for retrieving primary keys and foreign keys information

2016-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-13130:
-
Attachment: HIVE-13130.4.patch

>  HS2 changes : API calls for retrieving primary keys and foreign keys 
> information
> -
>
> Key: HIVE-13130
> URL: https://issues.apache.org/jira/browse/HIVE-13130
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13130.1.patch, HIVE-13130.2.patch, 
> HIVE-13130.3.patch, HIVE-13130.4.patch
>
>
> ODBC exposes the SQLPrimaryKeys and SQLForeignKeys API calls, and JDBC exposes 
> the getPrimaryKeys and getCrossReference API calls. We need to provide these 
> interfaces as part of the PK/FK implementation in Hive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13371) Fix test failure of testHasNull in TestColumnStatistics running on Windows

2016-03-28 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13371:
---
Affects Version/s: 2.0.0

> Fix test failure of testHasNull in TestColumnStatistics running on Windows
> --
>
> Key: HIVE-13371
> URL: https://issues.apache.org/jira/browse/HIVE-13371
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.0.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Minor
> Fix For: 2.1.0
>
> Attachments: HIVE-13371.01.patch
>
>
> As per [~prasanth_j]'s analysis, 
> {code}
> The ColumnStatistics tests are already known to fail on Windows. This is 
> mostly a file-size difference, which is not a product issue and can safely be 
> ignored. The failure occurs because ORC stores the timezone ID as a string in 
> the stripe footer. Since Linux and Windows produce different timezone ID 
> strings, the size of the stripe footer changes accordingly; because of this 
> timezone difference, the file sizes differ between Windows and Linux. We can 
> either update the orc-file-has-null.out output file on Windows or ignore this 
> test altogether.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13371) Fix test failure of testHasNull in TestColumnStatistics running on Windows

2016-03-28 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13371:
---
Fix Version/s: 2.1.0

> Fix test failure of testHasNull in TestColumnStatistics running on Windows
> --
>
> Key: HIVE-13371
> URL: https://issues.apache.org/jira/browse/HIVE-13371
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.0.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Minor
> Fix For: 2.1.0
>
> Attachments: HIVE-13371.01.patch
>
>
> As per [~prasanth_j]'s analysis, 
> {code}
> The ColumnStatistics tests are already known to fail on Windows. This is 
> mostly a file-size difference, which is not a product issue and can safely be 
> ignored. The failure occurs because ORC stores the timezone ID as a string in 
> the stripe footer. Since Linux and Windows produce different timezone ID 
> strings, the size of the stripe footer changes accordingly; because of this 
> timezone difference, the file sizes differ between Windows and Linux. We can 
> either update the orc-file-has-null.out output file on Windows or ignore this 
> test altogether.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13371) Fix test failure of testHasNull in TestColumnStatistics running on Windows

2016-03-28 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13371:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Fix test failure of testHasNull in TestColumnStatistics running on Windows
> --
>
> Key: HIVE-13371
> URL: https://issues.apache.org/jira/browse/HIVE-13371
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.0.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Minor
> Fix For: 2.1.0
>
> Attachments: HIVE-13371.01.patch
>
>
> As per [~prasanth_j]'s analysis, 
> {code}
> The ColumnStatistics tests are already known to fail on Windows. This is 
> mostly a file-size difference, which is not a product issue and can safely be 
> ignored. The failure occurs because ORC stores the timezone ID as a string in 
> the stripe footer. Since Linux and Windows produce different timezone ID 
> strings, the size of the stripe footer changes accordingly; because of this 
> timezone difference, the file sizes differ between Windows and Linux. We can 
> either update the orc-file-has-null.out output file on Windows or ignore this 
> test altogether.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13349) Metastore Changes : API calls for retrieving primary keys and foreign keys information

2016-03-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215084#comment-15215084
 ] 

Hive QA commented on HIVE-13349:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12795730/HIVE-13349.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 4 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/132/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/132/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-METASTORE-Test-132/

Messages:
{noformat}
LXC derby found.
LXC derby is not started. Starting container...
Container started.
Preparing derby container...
Container prepared.
Calling /hive/testutils/metastore/dbs/derby/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/derby/execute.sh ...
Tests executed.
LXC mysql found.
LXC mysql is not started. Starting container...
Container started.
Preparing mysql container...
Container prepared.
Calling /hive/testutils/metastore/dbs/mysql/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/mysql/execute.sh ...
Tests executed.
LXC oracle found.
LXC oracle is not started. Starting container...
Container started.
Preparing oracle container...
Container prepared.
Calling /hive/testutils/metastore/dbs/oracle/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/oracle/execute.sh ...
Tests executed.
LXC postgres found.
LXC postgres is not started. Starting container...
Container started.
Preparing postgres container...
Container prepared.
Calling /hive/testutils/metastore/dbs/postgres/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/postgres/execute.sh ...
Tests executed.
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12795730 - PreCommit-HIVE-METASTORE-Test

> Metastore Changes : API calls for retrieving primary keys and foreign keys 
> information
> --
>
> Key: HIVE-13349
> URL: https://issues.apache.org/jira/browse/HIVE-13349
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13349.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13349) Metastore Changes : API calls for retrieving primary keys and foreign keys information

2016-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-13349:
-
Summary: Metastore Changes : API calls for retrieving primary keys and 
foreign keys information  (was: Metastore Changes : HS2 changes : API calls for 
retrieving primary keys and foreign keys information)

> Metastore Changes : API calls for retrieving primary keys and foreign keys 
> information
> --
>
> Key: HIVE-13349
> URL: https://issues.apache.org/jira/browse/HIVE-13349
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13349.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13349) Metastore Changes : API calls for retrieving primary keys and foreign keys information

2016-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-13349:
-
Attachment: (was: HIVE-13349.1.patch)

> Metastore Changes : API calls for retrieving primary keys and foreign keys 
> information
> --
>
> Key: HIVE-13349
> URL: https://issues.apache.org/jira/browse/HIVE-13349
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13349.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13349) Metastore Changes : API calls for retrieving primary keys and foreign keys information

2016-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-13349:
-
Attachment: HIVE-13349.1.patch

> Metastore Changes : API calls for retrieving primary keys and foreign keys 
> information
> --
>
> Key: HIVE-13349
> URL: https://issues.apache.org/jira/browse/HIVE-13349
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13349.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13371) Fix test failure of testHasNull in TestColumnStatistics running on Windows

2016-03-28 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215070#comment-15215070
 ] 

Pengcheng Xiong commented on HIVE-13371:


OK. pushed to master.

> Fix test failure of testHasNull in TestColumnStatistics running on Windows
> --
>
> Key: HIVE-13371
> URL: https://issues.apache.org/jira/browse/HIVE-13371
> Project: Hive
>  Issue Type: Test
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-13371.01.patch
>
>
> As per [~prasanth_j]'s analysis, 
> {code}
> The ColumnStatistics tests are already known to fail on Windows. This is 
> mostly a file-size difference, which is not a product issue and can safely be 
> ignored. The failure occurs because ORC stores the timezone ID as a string in 
> the stripe footer. Since Linux and Windows produce different timezone ID 
> strings, the size of the stripe footer changes accordingly; because of this 
> timezone difference, the file sizes differ between Windows and Linux. We can 
> either update the orc-file-has-null.out output file on Windows or ignore this 
> test altogether.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13372) Hive Macro overwritten when multiple macros are used in one column

2016-03-28 Thread Lin Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215062#comment-15215062
 ] 

Lin Liu commented on HIVE-13372:


Thanks, Pengcheng.

Just FYI:
After some investigation, we found that the error is not related to SORT BY: 
even without the SORT BY clause, if the table's data is large enough to launch 
an MR job, the result is still wrong.


> Hive Macro overwritten when multiple macros are used in one column
> --
>
> Key: HIVE-13372
> URL: https://issues.apache.org/jira/browse/HIVE-13372
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Lin Liu
>Assignee: Pengcheng Xiong
>Priority: Critical
>
> When multiple macros are used in one column, results of the later ones are 
> overwritten by that of the first.
> For example:
> Suppose we have created a table called macro_test with single column x in 
> STRING type, and with data as:
> "a"
> "bb"
> "ccc"
> We also create three macros:
> {code}
> CREATE TEMPORARY MACRO STRING_LEN(x string) length(x);
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1;
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2;
> {code}
> When we ran the following query, 
> {code}
> SELECT
> CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", 
> STRING_LEN_PLUS_TWO(x)) a
> FROM macro_test
> SORT BY a DESC;
> {code}
> We get result:
> 3:3:3
> 2:2:2
> 1:1:1
> instead of expected:
> 3:4:5
> 2:3:4
> 1:2:3
> Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and 
> HIVE-12277 patches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13349) Metastore Changes : HS2 changes : API calls for retrieving primary keys and foreign keys information

2016-03-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215060#comment-15215060
 ] 

Hive QA commented on HIVE-13349:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12795726/HIVE-13349.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 4 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/131/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/131/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-METASTORE-Test-131/

Messages:
{noformat}
LXC derby found.
LXC derby is not started. Starting container...
Container started.
Preparing derby container...
Container prepared.
Calling /hive/testutils/metastore/dbs/derby/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/derby/execute.sh ...
Tests executed.
LXC mysql found.
LXC mysql is not started. Starting container...
Container started.
Preparing mysql container...
Container prepared.
Calling /hive/testutils/metastore/dbs/mysql/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/mysql/execute.sh ...
Tests executed.
LXC oracle found.
LXC oracle is not started. Starting container...
Container started.
Preparing oracle container...
Container prepared.
Calling /hive/testutils/metastore/dbs/oracle/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/oracle/execute.sh ...
Tests executed.
LXC postgres found.
LXC postgres is not started. Starting container...
Container started.
Preparing postgres container...
Container prepared.
Calling /hive/testutils/metastore/dbs/postgres/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/postgres/execute.sh ...
Tests executed.
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12795726 - PreCommit-HIVE-METASTORE-Test

> Metastore Changes : HS2 changes : API calls for retrieving primary keys and 
> foreign keys information
> 
>
> Key: HIVE-13349
> URL: https://issues.apache.org/jira/browse/HIVE-13349
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13349.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13371) Fix test failure of testHasNull in TestColumnStatistics running on Windows

2016-03-28 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13371:
---
Summary: Fix test failure of testHasNull in TestColumnStatistics running on 
Windows  (was: Fix test failure of testHasNull in TestColumnStatistics running 
on WIndows)

> Fix test failure of testHasNull in TestColumnStatistics running on Windows
> --
>
> Key: HIVE-13371
> URL: https://issues.apache.org/jira/browse/HIVE-13371
> Project: Hive
>  Issue Type: Test
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-13371.01.patch
>
>
> As per [~prasanth_j]'s analysis, 
> {code}
> The ColumnStatistics tests are already known to fail on Windows. This is 
> mostly a file-size difference, which is not a product issue and can safely be 
> ignored. The failure occurs because ORC stores the timezone ID as a string in 
> the stripe footer. Since Linux and Windows produce different timezone ID 
> strings, the size of the stripe footer changes accordingly; because of this 
> timezone difference, the file sizes differ between Windows and Linux. We can 
> either update the orc-file-has-null.out output file on Windows or ignore this 
> test altogether.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13372) Hive Macro overwritten when multiple macros are used in one column

2016-03-28 Thread Aleksei Statkevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksei Statkevich updated HIVE-13372:
--
Description: 
When multiple macros are used in one column, results of the later ones are 
overwritten by that of the first.

For example:
Suppose we have created a table called macro_test with single column x in 
STRING type, and with data as:
"a"
"bb"
"ccc"

We also create three macros:
{code}
CREATE TEMPORARY MACRO STRING_LEN(x string) length(x);
CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1;
CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2;
{code}

When we ran the following query, 
{code}
SELECT
CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", 
STRING_LEN_PLUS_TWO(x)) a
FROM macro_test
SORT BY a DESC;
{code}

We get result:
3:3:3
2:2:2
1:1:1

instead of expected:
3:4:5
2:3:4
1:2:3

Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and 
HIVE-12277 patches.

  was:
When multiple macros are used in one column, results of the later ones are 
overwritten by that of the first.

For example:
Suppose we have created a table called macro_test with single column x in 
STRING type, and with data as:
"a"
"bb"
"ccc"

We also create three macros:
CREATE TEMPORARY MACRO STRING_LEN(x string) length(x);
CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1;
CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2;

When we ran the following query, 
SELECT
CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", 
STRING_LEN_PLUS_TWO(x)) a
FROM macro_test
SORT BY a DESC;

We get result:
3:3:3
2:2:2
1:1:1

instead of expected:
3:4:5
2:3:4
1:2:3

Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and 
HIVE-12277 patches.


> Hive Macro overwritten when multiple macros are used in one column
> --
>
> Key: HIVE-13372
> URL: https://issues.apache.org/jira/browse/HIVE-13372
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Lin Liu
>Assignee: Pengcheng Xiong
>Priority: Critical
>
> When multiple macros are used in one column, results of the later ones are 
> overwritten by that of the first.
> For example:
> Suppose we have created a table called macro_test with single column x in 
> STRING type, and with data as:
> "a"
> "bb"
> "ccc"
> We also create three macros:
> {code}
> CREATE TEMPORARY MACRO STRING_LEN(x string) length(x);
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1;
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2;
> {code}
> When we ran the following query, 
> {code}
> SELECT
> CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", 
> STRING_LEN_PLUS_TWO(x)) a
> FROM macro_test
> SORT BY a DESC;
> {code}
> We get result:
> 3:3:3
> 2:2:2
> 1:1:1
> instead of expected:
> 3:4:5
> 2:3:4
> 1:2:3
> Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and 
> HIVE-12277 patches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13371) Fix test failure of testHasNull in TestColumnStatistics running on WIndows

2016-03-28 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215051#comment-15215051
 ] 

Ashutosh Chauhan commented on HIVE-13371:
-

+1. Test-only changes need not go through the Hive QA cycle.

> Fix test failure of testHasNull in TestColumnStatistics running on WIndows
> --
>
> Key: HIVE-13371
> URL: https://issues.apache.org/jira/browse/HIVE-13371
> Project: Hive
>  Issue Type: Test
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-13371.01.patch
>
>
> As per [~prasanth_j]'s analysis, 
> {code}
> The ColumnStatistics tests are already known to fail on Windows. This is 
> mostly a file-size difference, which is not a product issue and can safely be 
> ignored. The failure occurs because ORC stores the timezone ID as a string in 
> the stripe footer. Since Linux and Windows produce different timezone ID 
> strings, the size of the stripe footer changes accordingly; because of this 
> timezone difference, the file sizes differ between Windows and Linux. We can 
> either update the orc-file-has-null.out output file on Windows or ignore this 
> test altogether.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13371) Fix test failure of testHasNull in TestColumnStatistics running on WIndows

2016-03-28 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13371:
---
Status: Patch Available  (was: Open)

> Fix test failure of testHasNull in TestColumnStatistics running on WIndows
> --
>
> Key: HIVE-13371
> URL: https://issues.apache.org/jira/browse/HIVE-13371
> Project: Hive
>  Issue Type: Test
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-13371.01.patch
>
>
> As per [~prasanth_j]'s analysis, 
> {code}
> The ColumnStatistics tests are already known to fail on Windows. This is 
> mostly a file-size difference, which is not a product issue and can safely be 
> ignored. The failure occurs because ORC stores the timezone ID as a string in 
> the stripe footer. Since Linux and Windows produce different timezone ID 
> strings, the size of the stripe footer changes accordingly; because of this 
> timezone difference, the file sizes differ between Windows and Linux. We can 
> either update the orc-file-has-null.out output file on Windows or ignore this 
> test altogether.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13349) Metastore Changes : HS2 changes : API calls for retrieving primary keys and foreign keys information

2016-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-13349:
-
Attachment: HIVE-13349.1.patch

> Metastore Changes : HS2 changes : API calls for retrieving primary keys and 
> foreign keys information
> 
>
> Key: HIVE-13349
> URL: https://issues.apache.org/jira/browse/HIVE-13349
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13349.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13372) Hive Macro overwritten when multiple macros are used in one column

2016-03-28 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215045#comment-15215045
 ] 

Pengcheng Xiong commented on HIVE-13372:


Interesting. I will take a look.

> Hive Macro overwritten when multiple macros are used in one column
> --
>
> Key: HIVE-13372
> URL: https://issues.apache.org/jira/browse/HIVE-13372
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Lin Liu
>Priority: Critical
>
> When multiple macros are used in one column, results of the later ones are 
> overwritten by that of the first.
> For example:
> Suppose we have created a table called macro_test with single column x in 
> STRING type, and with data as:
> "a"
> "bb"
> "ccc"
> We also create three macros:
> CREATE TEMPORARY MACRO STRING_LEN(x string) length(x);
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1;
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2;
> When we ran the following query, 
> SELECT
> CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", 
> STRING_LEN_PLUS_TWO(x)) a
> FROM macro_test
> SORT BY a DESC;
> We get result:
> 3:3:3
> 2:2:2
> 1:1:1
> instead of expected:
> 3:4:5
> 2:3:4
> 1:2:3
> Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and 
> HIVE-12277 patches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13349) Metastore Changes : HS2 changes : API calls for retrieving primary keys and foreign keys information

2016-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-13349:
-
Status: Patch Available  (was: Open)

> Metastore Changes : HS2 changes : API calls for retrieving primary keys and 
> foreign keys information
> 
>
> Key: HIVE-13349
> URL: https://issues.apache.org/jira/browse/HIVE-13349
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13349.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13372) Hive Macro overwritten when multiple macros are used in one column

2016-03-28 Thread Lin Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Liu updated HIVE-13372:
---
Description: 
When multiple macros are used in one column, results of the later ones are 
overwritten by that of the first.

For example:
Suppose we have created a table called macro_test with single column x in 
STRING type, and with data as:
"a"
"bb"
"ccc"

We also create three macros:
CREATE TEMPORARY MACRO STRING_LEN(x string) length(x);
CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1;
CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2;

When we ran the following query, 
SELECT
CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", 
STRING_LEN_PLUS_TWO(x)) a
FROM macro_test
SORT BY a DESC;

We get result:
3:3:3
2:2:2
1:1:1

instead of expected:
3:4:5
2:3:4
1:2:3

Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and 
HIVE-12277 patches.

  was:
When multiple macros are used in one column, results of the later ones are 
overwritten by that of the first.

For example:
Suppose we have created a table called macro_test with single column x in 
STRING type, and with data as:
"a"
"bb"
"ccc"

We also create three macros:
CREATE TEMPORARY MACRO STRING_LEN(x string) length(x);
CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1;
CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2;

When we ran the following query, 
SELECT
CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", 
STRING_LEN_PLUS_TWO(x)) a
FROM macro_test
SORT BY a DESC;

We get result:
3:3:3
2:2:2
1:1:1

instead of expected:
3:4:5
2:3:4
1:2:3

Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and 
Hive-12277 patches.


> Hive Macro overwritten when multiple macros are used in one column
> --
>
> Key: HIVE-13372
> URL: https://issues.apache.org/jira/browse/HIVE-13372
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Lin Liu
>Priority: Critical
>
> When multiple macros are used in one column, results of the later ones are 
> overwritten by that of the first.
> For example:
> Suppose we have created a table called macro_test with single column x in 
> STRING type, and with data as:
> "a"
> "bb"
> "ccc"
> We also create three macros:
> CREATE TEMPORARY MACRO STRING_LEN(x string) length(x);
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1;
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2;
> When we ran the following query, 
> SELECT
> CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", 
> STRING_LEN_PLUS_TWO(x)) a
> FROM macro_test
> SORT BY a DESC;
> We get result:
> 3:3:3
> 2:2:2
> 1:1:1
> instead of expected:
> 3:4:5
> 2:3:4
> 1:2:3
> Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and 
> HIVE-12277 patches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13372) Hive Macro overwritten when multiple macros are used in one column

2016-03-28 Thread Lin Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Liu updated HIVE-13372:
---
Description: 
When multiple macros are used in one column, results of the later ones are 
overwritten by that of the first.

For example:
Suppose we have created a table called macro_test with single column x in 
STRING type, and with data as:
"a"
"bb"
"ccc"

We also create three macros:
CREATE TEMPORARY MACRO STRING_LEN(x string) length(x);
CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1;
CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2;

When we ran the following query, 
SELECT
CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", 
STRING_LEN_PLUS_TWO(x)) a
FROM macro_test
SORT BY a DESC;

We get result:
3:3:3
2:2:2
1:1:1

instead of expected:
3:4:5
2:3:4
1:2:3

Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and 
Hive-12277 patches.

  was:
When multiple macros are used in one column, results of the later ones are 
overwritten by that of the first.

For example:
Suppose we have created a table called macro_test with single column x in 
STRING type, and with data as:
"a"
"bb"
"ccc"

We also create three macros:
CREATE TEMPORARY MACRO STRING_LEN(x string) length(x);
CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1;
CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2;

When we ran the following query, 
SELECT
CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", 
STRING_LEN_PLUS_TWO(x)) a
FROM macro_test
SORT BY a DESC;

We get result:
3:3:3
2:2:2
1:1:1

instead of expected:
3:4:5
2:3:4
1:2:3


> Hive Macro overwritten when multiple macros are used in one column
> --
>
> Key: HIVE-13372
> URL: https://issues.apache.org/jira/browse/HIVE-13372
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Lin Liu
>Priority: Critical
>
> When multiple macros are used in one column, results of the later ones are 
> overwritten by that of the first.
> For example:
> Suppose we have created a table called macro_test with single column x in 
> STRING type, and with data as:
> "a"
> "bb"
> "ccc"
> We also create three macros:
> CREATE TEMPORARY MACRO STRING_LEN(x string) length(x);
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1;
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2;
> When we ran the following query, 
> SELECT
> CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", 
> STRING_LEN_PLUS_TWO(x)) a
> FROM macro_test
> SORT BY a DESC;
> We get result:
> 3:3:3
> 2:2:2
> 1:1:1
> instead of expected:
> 3:4:5
> 2:3:4
> 1:2:3
> Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and 
> Hive-12277 patches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12290) Native Vector ReduceSink

2016-03-28 Thread Shannon Ladymon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shannon Ladymon updated HIVE-12290:
---
Labels:   (was: TODOC2.0)

> Native Vector ReduceSink
> 
>
> Key: HIVE-12290
> URL: https://issues.apache.org/jira/browse/HIVE-12290
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HIVE-12290.01.patch, HIVE-12290.02.patch, 
> HIVE-12290.03.patch, HIVE-12290.04.patch, HIVE-12290.05.patch, 
> HIVE-12290.06.patch
>
>
> Currently, VectorReduceSinkOperator is a pass-through to ReduceSinkOperator, 
> so we incur object-inspector costs.
> Native vectorization will not use object inspectors and will allocate memory 
> up front that is reused for each batch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12290) Native Vector ReduceSink

2016-03-28 Thread Shannon Ladymon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215034#comment-15215034
 ] 

Shannon Ladymon commented on HIVE-12290:


Doc done:
* [Configuration Properties - hive.vectorized.execution.reducesink.new.enabled 
| 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.vectorized.execution.reducesink.new.enabled]
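
A minimal usage sketch of that property (toggling it per session is my
assumption of typical use, not taken from the doc page):

{code}
-- Enable (or disable) the native vectorized ReduceSink implementation
set hive.vectorized.execution.reducesink.new.enabled=true;
{code}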

> Native Vector ReduceSink
> 
>
> Key: HIVE-12290
> URL: https://issues.apache.org/jira/browse/HIVE-12290
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HIVE-12290.01.patch, HIVE-12290.02.patch, 
> HIVE-12290.03.patch, HIVE-12290.04.patch, HIVE-12290.05.patch, 
> HIVE-12290.06.patch
>
>
> Currently, VectorReduceSinkOperator is a pass-through to ReduceSinkOperator, 
> so we incur object-inspector costs.
> Native vectorization will not use object inspectors and will allocate memory 
> up front that is reused for each batch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join

2016-03-28 Thread Shannon Ladymon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215008#comment-15215008
 ] 

Shannon Ladymon commented on HIVE-9824:
---

[~mmccline], just wanted to check in once more about whether the description 
for *hive.vectorized.execution.mapjoin.minmax.enabled* should read "max/max" or 
"min/max".

> LLAP: Native Vectorization of Map Join
> --
>
> Key: HIVE-9824
> URL: https://issues.apache.org/jira/browse/HIVE-9824
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>  Labels: TODOC1.2
> Fix For: 1.2.0
>
> Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, 
> HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch, 
> HIVE-9824.08.patch, HIVE-9824.09.patch
>
>
> Today's VectorMapJoinOperator is a pass-through that converts each row from a 
> vectorized row batch into a Java Object[] row and passes it to the 
> MapJoinOperator superclass.
> This enhancement creates specialized vectorized map join operator classes 
> that are optimized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13370) Add test for HIVE-11470

2016-03-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-13370:

Priority: Minor  (was: Major)

> Add test for HIVE-11470
> ---
>
> Key: HIVE-13370
> URL: https://issues.apache.org/jira/browse/HIVE-13370
> Project: Hive
>  Issue Type: Bug
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>Priority: Minor
> Attachments: HIVE-13370.patch
>
>
> HIVE-11470 added the capability to handle NULL dynamic partitioning keys 
> properly. However, it did not add a test for that case; we should have one so 
> we don't regress in the same way in the future.
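
A hedged sketch of the scenario such a test would cover (table and column
names here are illustrative, not taken from the patch):

{code}
SET hive.exec.dynamic.partition.mode=nonstrict;
-- With HIVE-11470, a NULL dynamic-partition key should land in the default
-- partition (__HIVE_DEFAULT_PARTITION__) instead of failing the query.
INSERT OVERWRITE TABLE t PARTITION (ds) SELECT value, NULL AS ds FROM src;
{code}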



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13371) Fix test failure of testHasNull in TestColumnStatistics running on WIndows

2016-03-28 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13371:
---
Description: 
As per [~prasanth_j]'s analysis, 
{code}
The ColumnStatistics tests are already known to fail on Windows. This is 
mostly a file-size difference, which is not a product issue and can safely be 
ignored. The failure occurs because ORC stores the timezone ID as a string in 
the stripe footer. Since Linux and Windows produce different timezone ID 
strings, the size of the stripe footer changes accordingly; because of this 
timezone difference, the file sizes differ between Windows and Linux. We can 
either update the orc-file-has-null.out output file on Windows or ignore this 
test altogether.
{code}

> Fix test failure of testHasNull in TestColumnStatistics running on WIndows
> --
>
> Key: HIVE-13371
> URL: https://issues.apache.org/jira/browse/HIVE-13371
> Project: Hive
>  Issue Type: Test
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Minor
>
> As per [~prasanth_j]'s analysis, 
> {code}
> The ColumnStatistics tests are already known to fail on Windows. This is 
> mostly a file-size difference, which is not a product issue and can safely be 
> ignored. The failure occurs because ORC stores the timezone ID as a string in 
> the stripe footer. Since Linux and Windows produce different timezone ID 
> strings, the size of the stripe footer changes accordingly; because of this 
> timezone difference, the file sizes differ between Windows and Linux. We can 
> either update the orc-file-has-null.out output file on Windows or ignore this 
> test altogether.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13370) Add test for HIVE-11470

2016-03-28 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214935#comment-15214935
 ] 

Daniel Dai commented on HIVE-13370:
---

+1

> Add test for HIVE-11470
> ---
>
> Key: HIVE-13370
> URL: https://issues.apache.org/jira/browse/HIVE-13370
> Project: Hive
>  Issue Type: Bug
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-13370.patch
>
>
> HIVE-11470 added the capability to handle NULL dynamic partitioning keys 
> properly. However, it did not add a test for that case; we should have one so 
> we don't regress in the same way in the future.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13281) Update some default configs for LLAP

2016-03-28 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-13281:
--
Attachment: HIVE-13281.2.patch

Updated patch to fix the test failures.

> Update some default configs for LLAP
> 
>
> Key: HIVE-13281
> URL: https://issues.apache.org/jira/browse/HIVE-13281
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-13281.1.patch, HIVE-13281.2.patch
>
>
> Disable uber mode.
> Enable llap.io by default



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10729) Query failed when select complex columns from joined table (tez map join only)

2016-03-28 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214922#comment-15214922
 ] 

Matt McCline commented on HIVE-10729:
-

Thank you for the review.

> Query failed when select complex columns from joined table (tez map join 
> only)
> ---
>
> Key: HIVE-10729
> URL: https://issues.apache.org/jira/browse/HIVE-10729
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Selina Zhang
>Assignee: Matt McCline
> Attachments: HIVE-10729.03.patch, HIVE-10729.04.patch, 
> HIVE-10729.1.patch, HIVE-10729.2.patch
>
>
> When map join happens, if projection columns include complex data types, 
> query will fail. 
> Steps to reproduce:
> {code:sql}
> hive> set hive.auto.convert.join;
> hive.auto.convert.join=true
> hive> desc foo;
> a array<int>
> hive> select * from foo;
> [1,2]
> hive> desc src_int;
> key   int
> value string
> hive> select * from src_int where key=2;
> 2	val_2
> hive> select * from foo join src_int src  on src.key = foo.a[1];
> {code}
> Query will fail with stack trace
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryArray cannot be cast to 
> [Ljava.lang.Object;
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector.getList(StandardListObjectInspector.java:111)
>   at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:314)
>   at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:262)
>   at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:246)
>   at 
> org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:50)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:692)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:644)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:676)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:386)
>   ... 23 more
> {noformat}
> Similar error when projection columns include a map:
> {code:sql}
> hive> CREATE TABLE test (a INT, b MAP<INT, STRING>) STORED AS ORC;
> hive> INSERT OVERWRITE TABLE test SELECT 1, MAP(1, "val_1", 2, "val_2") FROM 
> src LIMIT 1;
> hive> select * from src join test where src.key=test.a;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13111) Fix timestamp / interval_day_time wrong results with HIVE-9862

2016-03-28 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13111:

Attachment: HIVE-13111.07.patch

Reduced the range of random timestamps produced, due to HIVE-12531.

> Fix timestamp / interval_day_time wrong results with HIVE-9862 
> ---
>
> Key: HIVE-13111
> URL: https://issues.apache.org/jira/browse/HIVE-13111
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-13111.01.patch, HIVE-13111.02.patch, 
> HIVE-13111.03.patch, HIVE-13111.04.patch, HIVE-13111.05.patch, 
> HIVE-13111.06.patch, HIVE-13111.07.patch
>
>
> Fix timestamp / interval_day_time issues discovered when testing the 
> Vectorized Text patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11936) Support SQLAnywhere as a backing DB for the hive metastore

2016-03-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-11936:

Assignee: (was: Sushanth Sowmyan)

> Support SQLAnywhere as a backing DB for the hive metastore
> --
>
> Key: HIVE-11936
> URL: https://issues.apache.org/jira/browse/HIVE-11936
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sushanth Sowmyan
>
> I've had pings from people interested in enabling the metastore to work on 
> top of SQLAnywhere (17+); thus I'm opening this jira to track the changes 
> needed in Hive to make SQLAnywhere work as a backing DB for the metastore.
> I have it working and passing all tests currently in my setup, and will 
> upload patches as I'm able to.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11936) Support SQLAnywhere as a backing DB for the hive metastore

2016-03-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214869#comment-15214869
 ] 

Sushanth Sowmyan commented on HIVE-11936:
-

Opening this up as unassigned - the work required changes to DataNucleus (DN) 
for a new SQLAnywhere adapter, plus further changes that I haven't been able 
to test thoroughly and haven't followed up on for a while. If someone else 
wants to take this on, please go ahead.

> Support SQLAnywhere as a backing DB for the hive metastore
> --
>
> Key: HIVE-11936
> URL: https://issues.apache.org/jira/browse/HIVE-11936
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>
> I've had pings from people interested in enabling the metastore to work on 
> top of SQLAnywhere (17+); thus I'm opening this jira to track the changes 
> needed in Hive to make SQLAnywhere work as a backing DB for the metastore.
> I have it working and passing all tests currently in my setup, and will 
> upload patches as I'm able to.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13370) Add test for HIVE-11470

2016-03-28 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214840#comment-15214840
 ] 

Mithun Radhakrishnan commented on HIVE-13370:
-

Thanks for adding the test, [~sushanth]. 
+1. Looks good.

> Add test for HIVE-11470
> ---
>
> Key: HIVE-13370
> URL: https://issues.apache.org/jira/browse/HIVE-13370
> Project: Hive
>  Issue Type: Bug
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-13370.patch
>
>
> HIVE-11470 added the capability to handle NULL dynamic partitioning keys 
> properly. However, it did not add a test for the case; we should have one so 
> we don't have future regressions of the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13370) Add test for HIVE-11470

2016-03-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-13370:

Status: Patch Available  (was: Open)

> Add test for HIVE-11470
> ---
>
> Key: HIVE-13370
> URL: https://issues.apache.org/jira/browse/HIVE-13370
> Project: Hive
>  Issue Type: Bug
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-13370.patch
>
>
> HIVE-11470 added the capability to handle NULL dynamic partitioning keys 
> properly. However, it did not add a test for the case; we should have one so 
> we don't have future regressions of the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13370) Add test for HIVE-11470

2016-03-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-13370:

Attachment: HIVE-13370.patch

Patch attached.

> Add test for HIVE-11470
> ---
>
> Key: HIVE-13370
> URL: https://issues.apache.org/jira/browse/HIVE-13370
> Project: Hive
>  Issue Type: Bug
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-13370.patch
>
>
> HIVE-11470 added the capability to handle NULL dynamic partitioning keys 
> properly. However, it did not add a test for the case; we should have one so 
> we don't have future regressions of the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13369) AcidUtils.getAcidState() is not paying attention to ValidTxnList when choosing the "best" base file

2016-03-28 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13369:
--
Description: 
The JavaDoc on getAcidState() reads, in part:

"Note that because major compactions don't
   preserve the history, we can't use a base directory that includes a
   transaction id that we must exclude."

which is correct but there is nothing in the code that does this.

  was:
The JavaDoc on getAcidState() reads, in part:

"Note that because major compactions don't
   * preserve the history, we can't use a base directory that includes a
   * transaction id that we must exclude."

which is correct but there is nothing in the code that does this.


> AcidUtils.getAcidState() is not paying attention to ValidTxnList when 
> choosing the "best" base file
> --
>
> Key: HIVE-13369
> URL: https://issues.apache.org/jira/browse/HIVE-13369
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
>
> The JavaDoc on getAcidState() reads, in part:
> "Note that because major compactions don't
>preserve the history, we can't use a base directory that includes a
>transaction id that we must exclude."
> which is correct but there is nothing in the code that does this.
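
A hedged sketch of the missing check, assuming the standard ValidTxnList API 
(illustrative only, not the eventual fix):

{code}
import org.apache.hadoop.hive.common.ValidTxnList;

public class BaseChooser {
    /**
     * Illustrative only: pick the highest base_N directory whose entire
     * history (all txns <= N) is visible to this reader, since a major
     * compaction folds that history into the base.
     */
    static long chooseBestBase(long[] baseTxnIds, ValidTxnList txns) {
        long best = -1;
        for (long base : baseTxnIds) {
            if (base > best
                && txns.isTxnRangeValid(1, base) == ValidTxnList.RangeResponse.ALL) {
                best = base;
            }
        }
        return best;  // -1 means no usable base; fall back to the deltas
    }
}
{code}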



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13370) Add test for HIVE-11470

2016-03-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214833#comment-15214833
 ] 

Sushanth Sowmyan commented on HIVE-13370:
-

[~mithun]/[~daijy], could you please review?

> Add test for HIVE-11470
> ---
>
> Key: HIVE-13370
> URL: https://issues.apache.org/jira/browse/HIVE-13370
> Project: Hive
>  Issue Type: Bug
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-13370.patch
>
>
> HIVE-11470 added the capability to handle NULL dynamic partitioning keys 
> properly. However, it did not add a test for the case; we should have one so 
> we don't have future regressions of the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13370) Add test for HIVE-11470

2016-03-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-13370:

Description: HIVE-11470 added the capability to handle NULL dynamic 
partitioning keys properly. However, it did not add a test for the case; we 
should have one so we don't have future regressions of the same.

> Add test for HIVE-11470
> ---
>
> Key: HIVE-13370
> URL: https://issues.apache.org/jira/browse/HIVE-13370
> Project: Hive
>  Issue Type: Bug
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>
> HIVE-11470 added the capability to handle NULL dynamic partitioning keys 
> properly. However, it did not add a test for the case; we should have one so 
> we don't have future regressions of the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13345) LLAP: metadata cache takes too much space, esp. with bloom filters, due to Java/protobuf overhead

2016-03-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214825#comment-15214825
 ] 

Sergey Shelukhin commented on HIVE-13345:
-

I think the problem is/was that ORC readers were created with proto objects. 
Anyway, I'll take a look at how complex both approaches are at some point (this 
week?)

> LLAP: metadata cache takes too much space, esp. with bloom filters, due to 
> Java/protobuf overhead
> -
>
> Key: HIVE-13345
> URL: https://issues.apache.org/jira/browse/HIVE-13345
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> We currently cache Java objects; these have high overhead. Average stripe 
> metadata takes 200-500KB on real files, and with bloom filters it blows up 
> more than 5x, up to 5MB per stripe, due to their being stored as List<Long>. 
> That is undesirable.
> We should either create better objects for ORC (might be good in general) or 
> store serialized metadata and deserialize when needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13345) LLAP: metadata cache takes too much space, esp. with bloom filters, due to Java/protobuf overhead

2016-03-28 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214816#comment-15214816
 ] 

Prasanth Jayachandran commented on HIVE-13345:
--

IMO we should store the serialized representation of the metadata. The 
deserialized representation (proto objects) is supposed to be short-lived. We 
have POJO equivalents for all the protobuf types (BloomFilter, 
ColumnStatistics, StripeInformation, etc.) that are created from the proto 
objects. If we are caching the deserialized representation, then we should 
cache those POJOs and not the proto objects.
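
To put rough numbers on the overhead under discussion (a self-contained 
sketch, not Hive code): a bloom filter bitset held as a List<Long> pays an 
object header plus a reference slot per 8-byte word, whereas a long[] pays 
8 bytes per word.

{code}
import java.util.ArrayList;
import java.util.List;

public class BloomBitsetFootprint {
    public static void main(String[] args) {
        int words = 1 << 17;             // 128K 64-bit words, ~1MB of payload

        long[] packed = new long[words]; // 8 bytes per word, one contiguous array

        // A List<Long> pays a ~16-byte Long object plus a reference slot per
        // word; values above 127 are not cached, so each add() allocates.
        List<Long> boxed = new ArrayList<>(words);
        for (int i = 0; i < words; i++) {
            boxed.add((long) (i + 128));
        }
        System.out.println(packed.length + " packed words vs " + boxed.size() + " boxed");
    }
}
{code}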

> LLAP: metadata cache takes too much space, esp. with bloom filters, due to 
> Java/protobuf overhead
> -
>
> Key: HIVE-13345
> URL: https://issues.apache.org/jira/browse/HIVE-13345
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> We currently cache Java objects; these have high overhead. Average stripe 
> metadata takes 200-500KB on real files, and with bloom filters it blows up 
> more than 5x, up to 5MB per stripe, due to their being stored as List<Long>. 
> That is undesirable.
> We should either create better objects for ORC (might be good in general) or 
> store serialized metadata and deserialize when needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC

2016-03-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-9660:
---
Attachment: (was: HIVE-9660.WIP2.patch)

> store end offset of compressed data for RG in RowIndex in ORC
> -
>
> Key: HIVE-9660
> URL: https://issues.apache.org/jira/browse/HIVE-9660
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-9660.01.patch, HIVE-9660.patch, HIVE-9660.patch
>
>
> Right now the end offset is estimated, which in some cases results in tons of 
> extra data being read.
> We can add a separate array to RowIndex (positions_v2?) that stores the 
> number of compressed buffers for each RG, or the end offset, or something, to 
> remove this estimation magic.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC

2016-03-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-9660:
---
Attachment: HIVE-9660.01.patch

Rebased the patch and fixed a small issue.

> store end offset of compressed data for RG in RowIndex in ORC
> -
>
> Key: HIVE-9660
> URL: https://issues.apache.org/jira/browse/HIVE-9660
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-9660.01.patch, HIVE-9660.WIP2.patch, 
> HIVE-9660.patch, HIVE-9660.patch
>
>
> Right now the end offset is estimated, which in some cases results in tons of 
> extra data being read.
> We can add a separate array to RowIndex (positions_v2?) that stores the 
> number of compressed buffers for each RG, or the end offset, or something, to 
> remove this estimation magic.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-03-28 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214796#comment-15214796
 ] 

Gopal V commented on HIVE-12049:


MaxRows = 1000

!old-driver-profiles.png!

The hot codepath with the new driver is 

{code}
 Stacks at 2016-03-28 01:10:19 PM (uptime 7m 58 sec)

 faeb41dd-3869-40cc-860b-748f505d5565 eab06890-8bb8-478f-877a-9282f5b4d64e HiveServer2-Handler-Pool: Thread-788 [RUNNABLE]
*** java.util.concurrent.ConcurrentHashMap.putAll(Map) ConcurrentHashMap.java:1084
*** java.util.concurrent.ConcurrentHashMap.<init>(Map) ConcurrentHashMap.java:852
*** org.apache.hadoop.conf.Configuration.<init>(Configuration) Configuration.java:713
*** org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf) HiveConf.java:3460
*** org.apache.hive.service.cli.operation.SQLOperation.getConfigForOperation() SQLOperation.java:529
*** org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(FetchOrientation, long) SQLOperation.java:360
*** org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationHandle, FetchOrientation, long) OperationManager.java:280
org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(OperationHandle, FetchOrientation, long, FetchType) HiveSessionImpl.java:786
org.apache.hive.service.cli.CLIService.fetchResults(OperationHandle, FetchOrientation, long, FetchType) CLIService.java:452
org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(TFetchResultsReq) ThriftCLIService.java:743
org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService$Iface, TCLIService$FetchResults_args) TCLIService.java:1557
org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(Object, TBase) TCLIService.java:1542
org.apache.thrift.ProcessFunction.process(int, TProtocol, TProtocol, Object) ProcessFunction.java:39
org.apache.thrift.TBaseProcessor.process(TProtocol, TProtocol) TBaseProcessor.java:39
org.apache.hive.service.auth.TSetIpAddressProcessor.process(TProtocol, TProtocol) TSetIpAddressProcessor.java:56
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run() TThreadPoolServer.java:286
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) ThreadPoolExecutor.java:1142
java.util.concurrent.ThreadPoolExecutor$Worker.run() ThreadPoolExecutor.java:617
java.lang.Thread.run() Thread.java:745
{code}
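
The stack shows {{SQLOperation.getConfigForOperation()}} copying a full 
Configuration on every fetch. A hedged sketch of the obvious mitigation 
(class and field names are illustrative, not the committed change):

{code}
import org.apache.hadoop.hive.conf.HiveConf;

// Illustrative: build the per-operation HiveConf once and reuse it across
// fetch calls, instead of paying the ConcurrentHashMap.putAll() copy per
// FetchResults round trip.
class OperationConfCache {
    private final HiveConf sessionConf;   // stand-in for the session's conf
    private volatile HiveConf operationConf;

    OperationConfCache(HiveConf sessionConf) {
        this.sessionConf = sessionConf;
    }

    HiveConf get() {
        HiveConf conf = operationConf;
        if (conf == null) {
            synchronized (this) {
                if (operationConf == null) {
                    operationConf = new HiveConf(sessionConf);  // copy once
                }
                conf = operationConf;
            }
        }
        return conf;
    }
}
{code}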

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, 
> HIVE-12049.9.patch, new-driver-profiles.png, old-driver-profiles.png
>
>
> For each fetch request to HiveServer2, we pay the penalty of deserializing 
> the row objects and translating them into a different representation suitable 
> for the RPC transfer. In moderate to high concurrency scenarios, this can 
> result in significant CPU and memory wastage. By having each task write the 
> appropriate thrift objects to the output files, HiveServer2 can simply stream 
> a batch of rows on the wire without incurring any of the additional cost of 
> deserialization and translation. 
> This can be implemented by writing a new SerDe, which the FileSinkOperator 
> can use to write thrift formatted row batches to the output file. Using the 
> pluggable property of the {{hive.query.result.fileformat}}, we can set it to 
> use SequenceFile and write a batch of thrift formatted rows as a value blob. 
> The FetchTask can now simply read the blob and send it over the wire. On the 
> client side, the *DBC driver can read the blob and since it is already 
> formatted in the way it expects, it can continue building the ResultSet the 
> way it does in the current implementation.
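
For reference, the pluggable setting the description relies on (illustrative 
usage; the thrift-serializing SerDe itself is what the patch adds):

{code:sql}
-- Route final-task output through SequenceFile so a SerDe can write thrift
-- formatted row batches as value blobs, as described above.
SET hive.query.result.fileformat=SequenceFile;
{code}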



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-03-28 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12049:
---
Attachment: old-driver-profiles.png

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, 
> HIVE-12049.9.patch, new-driver-profiles.png, old-driver-profiles.png
>
>
> For each fetch request to HiveServer2, we pay the penalty of deserializing 
> the row objects and translating them into a different representation suitable 
> for the RPC transfer. In moderate to high concurrency scenarios, this can 
> result in significant CPU and memory wastage. By having each task write the 
> appropriate thrift objects to the output files, HiveServer2 can simply stream 
> a batch of rows on the wire without incurring any of the additional cost of 
> deserialization and translation. 
> This can be implemented by writing a new SerDe, which the FileSinkOperator 
> can use to write thrift formatted row batches to the output file. Using the 
> pluggable property of the {{hive.query.result.fileformat}}, we can set it to 
> use SequenceFile and write a batch of thrift formatted rows as a value blob. 
> The FetchTask can now simply read the blob and send it over the wire. On the 
> client side, the *DBC driver can read the blob and since it is already 
> formatted in the way it expects, it can continue building the ResultSet the 
> way it does in the current implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-1718) Implement SerDe for processing fixed length data

2016-03-28 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1718:
-
Assignee: (was: Shreepadma Venugopalan)

> Implement SerDe for processing fixed length data
> 
>
> Key: HIVE-1718
> URL: https://issues.apache.org/jira/browse/HIVE-1718
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Carl Steinbach
>
> Fixed length fields are pretty common in legacy data formats. While it is 
> already
> possible to process these files using the RegexSerDe, they could be more 
> efficiently
> handled using a SerDe that is specifically crafted for reading/writing fixed 
> length
> fields. 
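
For comparison, the RegexSerDe workaround mentioned above looks roughly like 
this (a sketch assuming two fixed-width fields of 10 and 5 characters; table 
and column names are made up):

{code:sql}
CREATE TABLE fixed_width_demo (name STRING, code STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES ("input.regex" = "(.{10})(.{5})")
STORED AS TEXTFILE;
{code}

A dedicated fixed-length SerDe could skip the regex engine entirely and slice 
each line by offset.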



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13350) Support Alter commands for Rely/NoRely novalidate for PK/FK constraints

2016-03-28 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214774#comment-15214774
 ] 

Alan Gates commented on HIVE-13350:
---

Does this mean you want an alter command that can add a PK or FK?  Or do you 
want to be able to add/drop the rely/no rely options?  The latter doesn't make 
any sense since we have no ability to validate a PK or FK.
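
For concreteness, the two readings look roughly like this (standard-SQL 
flavored sketches, not syntax this JIRA has settled on):

{code:sql}
-- Reading 1: an ALTER that adds a whole constraint, RELY/NORELY inline.
ALTER TABLE product ADD CONSTRAINT product_pk
  PRIMARY KEY (product_id) RELY NOVALIDATE;

-- Reading 2: an ALTER that only toggles the RELY hint on an existing one.
ALTER TABLE product MODIFY CONSTRAINT product_pk NORELY;
{code}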

> Support Alter commands for Rely/NoRely novalidate for PK/FK constraints
> 
>
> Key: HIVE-13350
> URL: https://issues.apache.org/jira/browse/HIVE-13350
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13290) Support primary keys/foreign keys constraint as part of create table command in Hive

2016-03-28 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214772#comment-15214772
 ] 

Alan Gates commented on HIVE-13290:
---

Could you post a version of the patch without the generated code for easier 
review?

> Support primary keys/foreign keys constraint as part of create table command 
> in Hive
> 
>
> Key: HIVE-13290
> URL: https://issues.apache.org/jira/browse/HIVE-13290
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13290.1.patch, HIVE-13290.2.patch, 
> HIVE-13290.3.patch
>
>
> SUPPORT for the following statements
> {code}
> CREATE TABLE product 
>   ( 
>  product_id        INTEGER, 
>  product_vendor_id INTEGER, 
>  PRIMARY KEY (product_id), 
>  CONSTRAINT product_fk_1 FOREIGN KEY (product_vendor_id) REFERENCES 
> vendor(vendor_id) 
>   ); 
> CREATE TABLE vendor 
>   ( 
>  vendor_id INTEGER, 
>  PRIMARY KEY (vendor_id) 
>   ); 
> {code}
> In the above syntax, [CONSTRAINT constraint-Name] is optional. If it is not 
> specified by the user, we will use a system-generated constraint name. For 
> simplicity, we will allow the CONSTRAINT option for foreign keys but not for 
> the primary key, since there is only one primary key per table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13364) Allow llap to work with dynamic ports for rpc, shuffle, ui

2016-03-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214771#comment-15214771
 ] 

Hive QA commented on HIVE-13364:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12795524/HIVE-13364.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 9882 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFMonthString
org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFMonthTimestamp
org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFYearString
org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFYearTimestamp
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7398/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7398/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7398/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12795524 - PreCommit-HIVE-TRUNK-Build

> Allow llap to work with dynamic ports for rpc, shuffle, ui
> --
>
> Key: HIVE-13364
> URL: https://issues.apache.org/jira/browse/HIVE-13364
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-13364.1.patch
>
>
> At the moment, the ports specified in the configuration are the ones which 
> are used to register with the Zookeeper service. Setting the ports to 0 
> effectively means that the services cannot be discovered.
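
Assuming the property names stay as they are today (an assumption on my 
part), the dynamic-port mode would look something like:

{noformat}
# 0 = let the daemon pick a free port, then register the actual port in ZK
hive.llap.daemon.rpc.port=0
hive.llap.daemon.yarn.shuffle.port=0
hive.llap.daemon.web.port=0
{noformat}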



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10249) ACID: show locks should show who the lock is waiting for

2016-03-28 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-10249:
--
Attachment: HIVE-10249.3.patch

Patch 3 fixes the test issues.

> ACID: show locks should show who the lock is waiting for
> 
>
> Key: HIVE-10249
> URL: https://issues.apache.org/jira/browse/HIVE-10249
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-10249.2.patch, HIVE-10249.3.patch, HIVE-10249.patch
>
>
> instead of just showing state WAITING, we should include what the lock is 
> waiting for.  It will make diagnostics easier.
> It would also be useful to add QueryPlan.getQueryId() so it's easy to see 
> which query the lock belongs to.
> # need to store this in HIVE_LOCKS (additional field); this has a perf hit to 
> do another update on a failed attempt and to clear the field on a successful 
> attempt. 
>  (Actually on success, we update anyway).  How exactly would this be 
> displayed?  Each lock can block but we acquire all parts of external lock at 
> once.  Since we stop at first one that blocked, we’d only update that one…
> # This needs a matching Thrift change to pass to client: ShowLocksResponse
> # Perhaps we can start updating this info after lock was in W state for some 
> time to reduce perf hit.
> # This is mostly useful for “Why is my query stuck”



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11388) Allow ACID Compactor components to run in multiple metastores

2016-03-28 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214754#comment-15214754
 ] 

Alan Gates commented on HIVE-11388:
---

bq. Does this need to be documented in the wiki for releases 1.3.0 and 2.1.0?
I am not sure.  [~ekoifman] is this all of the work needed to make it possible 
to run multiple initiators and cleaners, or just part of it?  Have we tested 
running them in multiple metastores?  If the answer to those is yes, then the 
answer to [~leftylev]'s question is: "definitely".

> Allow ACID Compactor components to run in multiple metastores
> -
>
> Key: HIVE-11388
> URL: https://issues.apache.org/jira/browse/HIVE-11388
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-11388.2.patch, HIVE-11388.4.patch, 
> HIVE-11388.5.patch, HIVE-11388.6.patch, HIVE-11388.7.patch, 
> HIVE-11388.branch-1.patch, HIVE-11388.patch
>
>
> (this description is no longer accurate; see further comments)
> org.apache.hadoop.hive.ql.txn.compactor.Initiator is a thread that runs 
> inside the metastore service to manage compactions of ACID tables.  There 
> should be exactly 1 instance of this thread (even with multiple Thrift 
> services).
> This is documented in 
> https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration
>  but not enforced.
> Should add enforcement, since more than 1 Initiator could cause concurrent 
> attempts to compact the same table/partition - which will not work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13345) LLAP: metadata cache takes too much space, esp. with bloom filters, due to Java/protobuf overhead

2016-03-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214728#comment-15214728
 ] 

Sergey Shelukhin commented on HIVE-13345:
-

[~gopalv] [~prasanth_j] [~owen.omalley] opinions on the best approach? I am 
leaning towards changing ORC to use POJOs instead of OrcProto stuff, but as an 
alternative we can change metadata cache in LLAP to store serialized metadata. 
It comes down to the cost of deserializing every time in LLAP vs the cost of 
copying fields/converting some things (e.g. OrcProto stores bloom filters as 
List<Long>, which, aside from being horrible on pure merits, offends my 
engineering sensibilities, so I might be biased here).


> LLAP: metadata cache takes too much space, esp. with bloom filters, due to 
> Java/protobuf overhead
> -
>
> Key: HIVE-13345
> URL: https://issues.apache.org/jira/browse/HIVE-13345
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> We currently cache Java objects; these have high overhead. Average stripe 
> metadata takes 200-500KB on real files, and with bloom filters it blows up 
> more than 5x, up to 5MB per stripe, due to their being stored as List<Long>. 
> That is undesirable.
> We should either create better objects for ORC (might be good in general) or 
> store serialized metadata and deserialize when needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13345) LLAP: metadata cache takes too much space, esp. with bloom filters, due to Java/protobuf overhead

2016-03-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214728#comment-15214728
 ] 

Sergey Shelukhin edited comment on HIVE-13345 at 3/28/16 7:27 PM:
--

[~gopalv] [~prasanth_j] [~owen.omalley] opinions on the best approach? I am 
leaning towards changing ORC to use POJOs instead of OrcProto stuff, but as an 
alternative we can change metadata cache in LLAP to store serialized metadata. 
It comes down to the cost of deserializing every time in LLAP vs the cost of 
copying fields/converting some things (e.g. OrcProto stores bloom filters as 
List<Long>, which, aside from being horrible on purely practical grounds, 
offends my engineering sensibilities, so I might be biased here).



was (Author: sershe):
[~gopalv] [~prasanth_j] [~owen.omalley] opinions on the best approach? I am 
leaning towards changing ORC to use POJOs instead of OrcProto stuff, but as an 
alternative we can change metadata cache in LLAP to store serialized metadata. 
The cost of deserializing every time in LLAP vs the cost of copying 
fields/converting some things (e.g. OrcProto stores bloom filters as 
List<Long>, which aside from being horrible on pure merits, offends my 
engineering sensibilities, so I might be biased here).


> LLAP: metadata cache takes too much space, esp. with bloom filters, due to 
> Java/protobuf overhead
> -
>
> Key: HIVE-13345
> URL: https://issues.apache.org/jira/browse/HIVE-13345
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> We currently cache Java objects; these have high overhead. Average stripe 
> metadata takes 200-500KB on real files, and with bloom filters it blows up 
> more than 5x, up to 5MB per stripe, due to their being stored as List<Long>. 
> That is undesirable.
> We should either create better objects for ORC (might be good in general) or 
> store serialized metadata and deserialize when needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13332) support dumping all row indexes in ORC FileDump

2016-03-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13332:

Attachment: HIVE-13332.01.patch

Updated the patch with the out file changes for MiniTez... 

> support dumping all row indexes in ORC FileDump
> ---
>
> Key: HIVE-13332
> URL: https://issues.apache.org/jira/browse/HIVE-13332
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13332.01.patch, HIVE-13332.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13149) Remove some unnecessary HMS connections from HS2

2016-03-28 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13149:

Attachment: HIVE-13149.6.patch

> Remove some unnecessary HMS connections from HS2 
> -
>
> Key: HIVE-13149
> URL: https://issues.apache.org/jira/browse/HIVE-13149
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, 
> HIVE-13149.3.patch, HIVE-13149.4.patch, HIVE-13149.5.patch, HIVE-13149.6.patch
>
>
> In the SessionState class, we currently always try to get an HMS connection 
> in {{start(SessionState startSs, boolean isAsync, LogHelper console)}}, 
> regardless of whether the connection will be used later or not. 
> When SessionState is accessed by the tasks in TaskRunner.java, most of the 
> tasks (other than a few, like StatsTask) don't need to access HMS, yet 
> currently a new HMS connection is established for each Task thread. If 
> HiveServer2 is configured to run in parallel and the query involves many 
> tasks, the connections are created but go unused.
> {noformat}
>   @Override
>   public void run() {
> runner = Thread.currentThread();
> try {
>   OperationLog.setCurrentOperationLog(operationLog);
>   SessionState.start(ss);
>   runSequential();
> {noformat}
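
A hedged sketch of the direction this implies (not the actual patch): create 
the metastore client lazily, so task threads that never touch HMS never open 
a connection.

{code}
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.MetaException;

class LazyMetaStoreClientHolder {
    private final HiveConf conf;
    private IMetaStoreClient client;   // built on first use only

    LazyMetaStoreClientHolder(HiveConf conf) {
        this.conf = conf;
    }

    synchronized IMetaStoreClient get() throws MetaException {
        if (client == null) {
            client = new HiveMetaStoreClient(conf);  // first caller connects
        }
        return client;
    }
}
{code}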



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13149) Remove some unnecessary HMS connections from HS2

2016-03-28 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13149:

Attachment: (was: HIVE-13149.6.patch)

> Remove some unnecessary HMS connections from HS2 
> -
>
> Key: HIVE-13149
> URL: https://issues.apache.org/jira/browse/HIVE-13149
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, 
> HIVE-13149.3.patch, HIVE-13149.4.patch, HIVE-13149.5.patch, HIVE-13149.6.patch
>
>
> In the SessionState class, we currently always try to get an HMS connection 
> in {{start(SessionState startSs, boolean isAsync, LogHelper console)}}, 
> regardless of whether the connection will be used later or not. 
> When SessionState is accessed by the tasks in TaskRunner.java, most of the 
> tasks (other than a few, like StatsTask) don't need to access HMS, yet 
> currently a new HMS connection is established for each Task thread. If 
> HiveServer2 is configured to run in parallel and the query involves many 
> tasks, the connections are created but go unused.
> {noformat}
>   @Override
>   public void run() {
> runner = Thread.currentThread();
> try {
>   OperationLog.setCurrentOperationLog(operationLog);
>   SessionState.start(ss);
>   runSequential();
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12992) Hive on tez: Bucket map join plan is incorrect

2016-03-28 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-12992:
--
   Resolution: Fixed
Fix Version/s: 2.0.1
   2.1.0
   1.2.2
   Status: Resolved  (was: Patch Available)

> Hive on tez: Bucket map join plan is incorrect
> --
>
> Key: HIVE-12992
> URL: https://issues.apache.org/jira/browse/HIVE-12992
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
>  Labels: tez
> Fix For: 1.2.2, 2.1.0, 2.0.1
>
> Attachments: HIVE-12992.1.patch, HIVE-12992.2.patch
>
>
> TPCH Query 9 fails when bucket map join is enabled:
> {code}
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 
> 5, vertexId=vertex_1450634494433_0007_2_06, diagnostics=[Exception in 
> EdgeManager, vertex=vertex_1450634494433_0007_2_06 [Reducer 5], Fail to 
> sendTezEventToDestinationTasks, event:DataMovementEvent [sourceIndex=0, 
> targetIndex=-1, version=0], sourceInfo:{ producerConsumerType=OUTPUT, 
> taskVertexName=Map 1, edgeVertexName=Reducer 5, 
> taskAttemptId=attempt_1450634494433_0007_2_05_00_0 }, 
> destinationInfo:null, EdgeInfo: sourceVertexName=Map 1, 
> destinationVertexName=Reducer 5, java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.tez.CustomPartitionEdge.routeDataMovementEventToDestination(CustomPartitionEdge.java:88)
>   at 
> org.apache.tez.dag.app.dag.impl.Edge.sendTezEventToDestinationTasks(Edge.java:458)
>   at 
> org.apache.tez.dag.app.dag.impl.Edge.handleCompositeDataMovementEvent(Edge.java:386)
>   at 
> org.apache.tez.dag.app.dag.impl.Edge.sendTezEventToDestinationTasks(Edge.java:439)
>   at 
> org.apache.tez.dag.app.dag.impl.VertexImpl.handleRoutedTezEvents(VertexImpl.java:4382)
>   at 
> org.apache.tez.dag.app.dag.impl.VertexImpl.access$4000(VertexImpl.java:202)
>   at 
> org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:4172)
>   at 
> org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:4164)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13361) Orc concatenation should enforce the compression buffer size

2016-03-28 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13361:
-
Attachment: HIVE-13361.1.patch

Not sure why precommit did not pick up the patch. Re-uploading it.

> Orc concatenation should enforce the compression buffer size
> 
>
> Key: HIVE-13361
> URL: https://issues.apache.org/jira/browse/HIVE-13361
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0, 2.1.0
>Reporter: Yi Zhang
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-13361.1.patch, HIVE-13361.1.patch, alltypesorc3xcols
>
>
> With HIVE-11807, buffer size estimation happens by default. This can have an 
> undesired effect w.r.t. file concatenation. Consider the following table with 
> files
> {code}
> testtable
>   -- 00_0 (created before HIVE-11807 which has buffer size 256KB)
>   -- 01_0 (created before HIVE-11807 which has buffer size 256KB)
>   -- 02_0 (created after HIVE-11807 with buffer size chosen as 128KB)
>   -- 03_0 (created after HIVE-11807 with buffer size chosen as 128KB)
> {code}
> If we perform ALTER TABLE .. CONCATENATE on the above table with HIVE-11807, 
> then, depending on the split arrangement, 00_0 and 01_0 will be 
> concatenated together into a new merged file. But this new merged file will 
> have a 128KB buffer size (the estimated buffer size, not the requested buffer 
> size). Since the new ORC writer does not honor the requested buffer size, the 
> new merged files will have smaller buffers than the required 256KB, making 
> the file unreadable. The following exception is thrown when reading the table 
> after concatenation:
> {code}
> 2016-03-24T16:26:33,974 ERROR [a9e27a9a-37cb-411d-9708-6c58a4ce34f2 main]: 
> CliDriver (SessionState.java:printError(1049)) - Failed with exception 
> java.io.IOException:java.lang.IllegalArgumentException: Buffer size too 
> small. size = 131072 needed = 153187
> java.io.IOException: java.lang.IllegalArgumentException: Buffer size too 
> small. size = 131072 needed = 153187
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:513)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:420)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:145)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1848)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:256)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:782)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:721)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:648)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {code}
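
A minimal sketch of the enforcement the title asks for (illustrative, not the 
attached patch): the merged file's compression buffer must be at least the 
largest buffer among the input files.

{code}
public class BufferSizeEnforcer {
    /**
     * Illustrative: max(estimated, all input buffer sizes). For the table
     * above: max(128KB, 256KB, 256KB, 128KB, 128KB) = 256KB, so the merged
     * file stays readable.
     */
    static int enforcedBufferSize(int estimatedSize, int[] inputBufferSizes) {
        int size = estimatedSize;
        for (int inputSize : inputBufferSizes) {
            size = Math.max(size, inputSize);
        }
        return size;
    }
}
{code}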



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13326) HiveServer2: Make ZK config publishing configurable

2016-03-28 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214672#comment-15214672
 ] 

Thejas M Nair commented on HIVE-13326:
--

+1
Thanks for creating the tests; we can now build on them for future service 
discovery patches!


> HiveServer2: Make ZK config publishing configurable
> ---
>
> Key: HIVE-13326
> URL: https://issues.apache.org/jira/browse/HIVE-13326
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, JDBC
>Affects Versions: 2.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-13326.1.patch, HIVE-13326.2.patch
>
>
> We should revert to older behaviour when config publishing is disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12992) Hive on tez: Bucket map join plan is incorrect

2016-03-28 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214647#comment-15214647
 ] 

Vikram Dixit K commented on HIVE-12992:
---

The bucket_map_join test failure is related; it's a golden file update that I 
missed. Posting a new patch here with the golden file update.

> Hive on tez: Bucket map join plan is incorrect
> --
>
> Key: HIVE-12992
> URL: https://issues.apache.org/jira/browse/HIVE-12992
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
>  Labels: tez
> Attachments: HIVE-12992.1.patch, HIVE-12992.2.patch
>
>
> TPCH Query 9 fails when bucket map join is enabled:
> {code}
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 
> 5, vertexId=vertex_1450634494433_0007_2_06, diagnostics=[Exception in 
> EdgeManager, vertex=vertex_1450634494433_0007_2_06 [Reducer 5], Fail to 
> sendTezEventToDestinationTasks, event:DataMovementEvent [sourceIndex=0, 
> targetIndex=-1, version=0], sourceInfo:{ producerConsumerType=OUTPUT, 
> taskVertexName=Map 1, edgeVertexName=Reducer 5, 
> taskAttemptId=attempt_1450634494433_0007_2_05_00_0 }, 
> destinationInfo:null, EdgeInfo: sourceVertexName=Map 1, 
> destinationVertexName=Reducer 5, java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.tez.CustomPartitionEdge.routeDataMovementEventToDestination(CustomPartitionEdge.java:88)
>   at 
> org.apache.tez.dag.app.dag.impl.Edge.sendTezEventToDestinationTasks(Edge.java:458)
>   at 
> org.apache.tez.dag.app.dag.impl.Edge.handleCompositeDataMovementEvent(Edge.java:386)
>   at 
> org.apache.tez.dag.app.dag.impl.Edge.sendTezEventToDestinationTasks(Edge.java:439)
>   at 
> org.apache.tez.dag.app.dag.impl.VertexImpl.handleRoutedTezEvents(VertexImpl.java:4382)
>   at 
> org.apache.tez.dag.app.dag.impl.VertexImpl.access$4000(VertexImpl.java:202)
>   at 
> org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:4172)
>   at 
> org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:4164)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12992) Hive on tez: Bucket map join plan is incorrect

2016-03-28 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-12992:
--
Attachment: HIVE-12992.2.patch

> Hive on tez: Bucket map join plan is incorrect
> --
>
> Key: HIVE-12992
> URL: https://issues.apache.org/jira/browse/HIVE-12992
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
>  Labels: tez
> Attachments: HIVE-12992.1.patch, HIVE-12992.2.patch
>
>
> TPCH Query 9 fails when bucket map join is enabled:
> {code}
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 
> 5, vertexId=vertex_1450634494433_0007_2_06, diagnostics=[Exception in 
> EdgeManager, vertex=vertex_1450634494433_0007_2_06 [Reducer 5], Fail to 
> sendTezEventToDestinationTasks, event:DataMovementEvent [sourceIndex=0, 
> targetIndex=-1, version=0], sourceInfo:{ producerConsumerType=OUTPUT, 
> taskVertexName=Map 1, edgeVertexName=Reducer 5, 
> taskAttemptId=attempt_1450634494433_0007_2_05_00_0 }, 
> destinationInfo:null, EdgeInfo: sourceVertexName=Map 1, 
> destinationVertexName=Reducer 5, java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.tez.CustomPartitionEdge.routeDataMovementEventToDestination(CustomPartitionEdge.java:88)
>   at 
> org.apache.tez.dag.app.dag.impl.Edge.sendTezEventToDestinationTasks(Edge.java:458)
>   at 
> org.apache.tez.dag.app.dag.impl.Edge.handleCompositeDataMovementEvent(Edge.java:386)
>   at 
> org.apache.tez.dag.app.dag.impl.Edge.sendTezEventToDestinationTasks(Edge.java:439)
>   at 
> org.apache.tez.dag.app.dag.impl.VertexImpl.handleRoutedTezEvents(VertexImpl.java:4382)
>   at 
> org.apache.tez.dag.app.dag.impl.VertexImpl.access$4000(VertexImpl.java:202)
>   at 
> org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:4172)
>   at 
> org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:4164)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13290) Support primary keys/foreign keys constraint as part of create table command in Hive

2016-03-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214642#comment-15214642
 ] 

Hive QA commented on HIVE-13290:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12795670/HIVE-13290.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 4 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/130/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/130/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-METASTORE-Test-130/

Messages:
{noformat}
LXC derby found.
LXC derby is not started. Starting container...
Container started.
Preparing derby container...
Container prepared.
Calling /hive/testutils/metastore/dbs/derby/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/derby/execute.sh ...
Tests executed.
LXC mysql found.
LXC mysql is not started. Starting container...
Container started.
Preparing mysql container...
Container prepared.
Calling /hive/testutils/metastore/dbs/mysql/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/mysql/execute.sh ...
Tests executed.
LXC oracle found.
LXC oracle is not started. Starting container...
Container started.
Preparing oracle container...
Container prepared.
Calling /hive/testutils/metastore/dbs/oracle/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/oracle/execute.sh ...
Tests executed.
LXC postgres found.
LXC postgres is not started. Starting container...
Container started.
Preparing postgres container...
Container prepared.
Calling /hive/testutils/metastore/dbs/postgres/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/postgres/execute.sh ...
Tests executed.
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12795670 - PreCommit-HIVE-METASTORE-Test

> Support primary keys/foreign keys constraint as part of create table command 
> in Hive
> 
>
> Key: HIVE-13290
> URL: https://issues.apache.org/jira/browse/HIVE-13290
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13290.1.patch, HIVE-13290.2.patch, 
> HIVE-13290.3.patch
>
>
> SUPPORT for the following statements
> {code}
> CREATE TABLE product 
>   ( 
>  product_id        INTEGER, 
>  product_vendor_id INTEGER, 
>  PRIMARY KEY (product_id), 
>  CONSTRAINT product_fk_1 FOREIGN KEY (product_vendor_id) REFERENCES 
> vendor(vendor_id) 
>   ); 
> CREATE TABLE vendor 
>   ( 
>  vendor_id INTEGER, 
>  PRIMARY KEY (vendor_id) 
>   ); 
> {code}
> In the above syntax, [CONSTRAINT constraint-Name] is optional. If it is not 
> specified by the user, we will use a system-generated constraint name. For 
> simplicity, we will allow the CONSTRAINT option for foreign keys but not for 
> the primary key, since there is only one primary key per table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-03-28 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214643#comment-15214643
 ] 

Thejas M Nair commented on HIVE-12049:
--

[~gopalv]
Thanks for profiling it! 
What is it like without the optimization ?
What is the JDBC fetchRowSize being used ?

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, 
> HIVE-12049.9.patch, new-driver-profiles.png
>
>
> For each fetch request to HiveServer2, we pay the penalty of deserializing 
> the row objects and translating them into a different representation suitable 
> for the RPC transfer. In moderate to high concurrency scenarios, this can 
> result in significant CPU and memory wastage. By having each task write the 
> appropriate thrift objects to the output files, HiveServer2 can simply stream 
> a batch of rows on the wire without incurring any of the additional cost of 
> deserialization and translation. 
> This can be implemented by writing a new SerDe, which the FileSinkOperator 
> can use to write thrift formatted row batches to the output file. Using the 
> pluggable property of the {{hive.query.result.fileformat}}, we can set it to 
> use SequenceFile and write a batch of thrift formatted rows as a value blob. 
> The FetchTask can now simply read the blob and send it over the wire. On the 
> client side, the *DBC driver can read the blob and since it is already 
> formatted in the way it expects, it can continue building the ResultSet the 
> way it does in the current implementation.
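
To make the value-blob idea above concrete, here is a minimal sketch, assuming plain libthrift (0.9.x-era constructors) and Hadoop's BytesWritable. The class and method names are illustrative only; the actual patch routes this through a new SerDe and {{hive.query.result.fileformat}} as described in the quoted text.

{code:java}
import org.apache.hadoop.io.BytesWritable;
import org.apache.thrift.TBase;
import org.apache.thrift.TDeserializer;
import org.apache.thrift.TException;
import org.apache.thrift.TSerializer;
import org.apache.thrift.protocol.TCompactProtocol;

// Toy codec: a task encodes a Thrift row batch once; HiveServer2 can then
// stream the stored bytes as-is, and the client decodes them directly.
public class ThriftBatchBlobCodec {
  private final TSerializer serializer =
      new TSerializer(new TCompactProtocol.Factory());
  private final TDeserializer deserializer =
      new TDeserializer(new TCompactProtocol.Factory());

  // Writer side: produce the SequenceFile value blob for one batch of rows.
  public <T extends TBase<?, ?>> BytesWritable encode(T batch) throws TException {
    return new BytesWritable(serializer.serialize(batch));
  }

  // Reader side: rebuild the same Thrift object from the stored blob,
  // without re-deserializing rows into an intermediate representation.
  public <T extends TBase<?, ?>> T decode(T reusableBatch, BytesWritable blob)
      throws TException {
    deserializer.deserialize(reusableBatch, blob.copyBytes());
    return reusableBatch;
  }
}
{code}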



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12960) Migrate Column Stats Extrapolation and UniformDistribution to HBaseStore

2016-03-28 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214636#comment-15214636
 ] 

Pengcheng Xiong commented on HIVE-12960:


[~sershe], thanks for your attention. The SQL part is not removed, and my 
analysis is posted at 09/Feb/16 22:53. Could you please scroll up and take a 
look? :) 

> Migrate Column Stats Extrapolation and UniformDistribution to HBaseStore
> 
>
> Key: HIVE-12960
> URL: https://issues.apache.org/jira/browse/HIVE-12960
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: HIVE-12960.01.patch, HIVE-12960.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10729) Query failed when select complex columns from joinned table (tez map join only)

2016-03-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214635#comment-15214635
 ] 

Sergey Shelukhin commented on HIVE-10729:
-

Didn't look at the test file; I assume it's the same as w/o vector :)
posSingleVectorMapJoinSmallTable - assumes there are two elements in the array, 
right? Should there be an assert? 
Looks good pending tests otherwise. +1

> Query failed when select complex columns from joinned table (tez map join 
> only)
> ---
>
> Key: HIVE-10729
> URL: https://issues.apache.org/jira/browse/HIVE-10729
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Selina Zhang
>Assignee: Matt McCline
> Attachments: HIVE-10729.03.patch, HIVE-10729.04.patch, 
> HIVE-10729.1.patch, HIVE-10729.2.patch
>
>
> When map join happens, if projection columns include complex data types, 
> query will fail. 
> Steps to reproduce:
> {code:sql}
> hive> set hive.auto.convert.join;
> hive.auto.convert.join=true
> hive> desc foo;
> a array<int>
> hive> select * from foo;
> [1,2]
> hive> desc src_int;
> key   int
> value string
> hive> select * from src_int where key=2;
> 2    val_2
> hive> select * from foo join src_int src  on src.key = foo.a[1];
> {code}
> Query will fail with stack trace
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryArray cannot be cast to 
> [Ljava.lang.Object;
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector.getList(StandardListObjectInspector.java:111)
>   at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:314)
>   at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:262)
>   at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:246)
>   at 
> org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:50)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:692)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:644)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:676)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:386)
>   ... 23 more
> {noformat}
> Similar error when projection columns include a map:
> {code:sql}
> hive> CREATE TABLE test (a INT, b MAP<INT, STRING>) STORED AS ORC;
> hive> INSERT OVERWRITE TABLE test SELECT 1, MAP(1, "val_1", 2, "val_2") FROM 
> src LIMIT 1;
> hive> select * from src join test where src.key=test.a;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12937) DbNotificationListener unable to clean up old notification events

2016-03-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214633#comment-15214633
 ] 

Sushanth Sowmyan commented on HIVE-12937:
-

None of the test failures seem to be related to this patch. [~alangates], could 
you please review?

> DbNotificationListener unable to clean up old notification events
> -
>
> Key: HIVE-12937
> URL: https://issues.apache.org/jira/browse/HIVE-12937
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.2.1, 2.0.0, 2.1.0
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-12937.patch
>
>
> There is a bug in ObjectStore, where we use pm.deletePersistent instead of 
> pm.deletePersistentAll, which causes the persistence manager to try to 
> delete a org.datanucleus.store.rdbms.query.ForwardQueryResult instead of the 
> appropriate associated 
> org.apache.hadoop.hive.metastore.model.MNotificationLog objects.
> This results in an error that looks like this:
> {noformat}
> Exception in thread "CleanerThread" 
> org.datanucleus.api.jdo.exceptions.ClassNotPersistenceCapableException: The 
> class "org.datanucleus.store.rdbms.query.ForwardQueryResult" is not 
> persistable. This means that it either hasnt been enhanced, or that the 
> enhanced version of the file is not in the CLASSPATH (or is hidden by an 
> unenhanced version), or the Meta-Data/annotations for the class are not found.
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:380)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.jdoDeletePersistent(JDOPersistenceManager.java:807)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.deletePersistent(JDOPersistenceManager.java:820)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.cleanNotificationEvents(ObjectStore.java:7149)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
> at com.sun.proxy.$Proxy0.cleanNotificationEvents(Unknown Source)
> at 
> org.apache.hive.hcatalog.listener.DbNotificationListener$CleanerThread.run(DbNotificationListener.java:277)
> NestedThrowablesStackTrace:
> The class "org.datanucleus.store.rdbms.query.ForwardQueryResult" is not 
> persistable. This means that it either hasnt been enhanced, or that the 
> enhanced version of the file is not in the CLASSPATH (or is hidden by an 
> unenhanced version), or the Meta-Data/annotations for the class are not found.
> org.datanucleus.exceptions.ClassNotPersistableException: The class 
> "org.datanucleus.store.rdbms.query.ForwardQueryResult" is not persistable. 
> This means that it either hasnt been enhanced, or that the enhanced version 
> of the file is not in the CLASSPATH (or is hidden by an unenhanced version), 
> or the Meta-Data/annotations for the class are not found.
> at 
> org.datanucleus.ExecutionContextImpl.assertClassPersistable(ExecutionContextImpl.java:5698)
> at 
> org.datanucleus.ExecutionContextImpl.deleteObjectInternal(ExecutionContextImpl.java:2495)
> at 
> org.datanucleus.ExecutionContextImpl.deleteObjectWork(ExecutionContextImpl.java:2466)
> at 
> org.datanucleus.ExecutionContextImpl.deleteObject(ExecutionContextImpl.java:2417)
> at 
> org.datanucleus.ExecutionContextThreadedImpl.deleteObject(ExecutionContextThreadedImpl.java:245)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.jdoDeletePersistent(JDOPersistenceManager.java:802)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.deletePersistent(JDOPersistenceManager.java:820)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.cleanNotificationEvents(ObjectStore.java:7149)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
> at com.sun.proxy.$Proxy0.cleanNotificationEvents(Unknown Source)
> at 
> org.apache.hive.hcatalog.listener.DbNotificationListener$CleanerThread.run(DbNotificationListener.java:277)
> {noformat}
> The end result of this bug is that users of DbNotificationListener will have 
> an ever-growing number of notification events that are not cleaned up as 
> they age. This is an easy enough fix, but it shows that we have a lack of 
> test coverage here.
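
A minimal sketch of the fix pattern, assuming a JDO PersistenceManager and the MNotificationLog model class named above (the eventTime field name is an assumption for illustration; this is not the actual patch):

{code:java}
import java.util.Collection;

import javax.jdo.PersistenceManager;
import javax.jdo.Query;

import org.apache.hadoop.hive.metastore.model.MNotificationLog;

public class NotificationCleanupSketch {
  // The JDO query result is a Collection (a DataNucleus ForwardQueryResult),
  // not a persistable object, so it must go to deletePersistentAll(...);
  // calling deletePersistent(...) on it is exactly the bug described above.
  @SuppressWarnings("unchecked")
  static void cleanEventsOlderThan(PersistenceManager pm, int tooOld) {
    Query query = pm.newQuery(MNotificationLog.class, "eventTime <= tooOld");
    query.declareParameters("java.lang.Integer tooOld");
    Collection<MNotificationLog> old =
        (Collection<MNotificationLog>) query.execute(tooOld);
    if (old != null && !old.isEmpty()) {
      pm.deletePersistentAll(old);  // not pm.deletePersistent(old)
    }
  }
}
{code}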



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13365) Allow multiple llap instances with the MiniLlap cluster

2016-03-28 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214631#comment-15214631
 ] 

Siddharth Seth commented on HIVE-13365:
---

At the moment, there's no way to do that. It's also pointless since the tests 
are not big enough to run multiple instances. I guess we could add an option to 
the testrunner to control the number of instances. Will create a follow-up 
jira for that.
I intend to use the multiple-instance feature in some failure-handling tests 
in the main codebase.

> Allow multiple llap instances with the MiniLlap cluster
> ---
>
> Key: HIVE-13365
> URL: https://issues.apache.org/jira/browse/HIVE-13365
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-13365.01.patch, HIVE-13365.1.review.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12960) Migrate Column Stats Extrapolation and UniformDistribution to HBaseStore

2016-03-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214626#comment-15214626
 ] 

Sergey Shelukhin commented on HIVE-12960:
-

Hmm... I don't see the SQL part removed. Was it just copy/pasted?

> Migrate Column Stats Extrapolation and UniformDistribution to HBaseStore
> 
>
> Key: HIVE-12960
> URL: https://issues.apache.org/jira/browse/HIVE-12960
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: HIVE-12960.01.patch, HIVE-12960.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13346) LLAP doesn't update metadata priority when reusing from cache; some tweaks in LRFU policy

2016-03-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214628#comment-15214628
 ] 

Sergey Shelukhin commented on HIVE-13346:
-

[~sseth] maybe you can review?

> LLAP doesn't update metadata priority when reusing from cache; some tweaks in 
> LRFU policy
> -
>
> Key: HIVE-13346
> URL: https://issues.apache.org/jira/browse/HIVE-13346
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13346.01.patch, HIVE-13346.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13365) Allow multiple llap instances with the MiniLlap cluster

2016-03-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214621#comment-15214621
 ] 

Sergey Shelukhin commented on HIVE-13365:
-

+1 pending tests. How does one trigger multiple instances?

> Allow multiple llap instances with the MiniLlap cluster
> ---
>
> Key: HIVE-13365
> URL: https://issues.apache.org/jira/browse/HIVE-13365
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-13365.01.patch, HIVE-13365.1.review.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13290) Support primary keys/foreign keys constraint as part of create table command in Hive

2016-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-13290:
-
Attachment: (was: HIVE-13290.2.patch)

> Support primary keys/foreign keys constraint as part of create table command 
> in Hive
> 
>
> Key: HIVE-13290
> URL: https://issues.apache.org/jira/browse/HIVE-13290
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13290.1.patch, HIVE-13290.2.patch, 
> HIVE-13290.3.patch
>
>
> Support for the following statements:
> {code}
> CREATE TABLE product 
>   ( 
>  product_id        INTEGER, 
>  product_vendor_id INTEGER, 
>  PRIMARY KEY (product_id), 
>  CONSTRAINT product_fk_1 FOREIGN KEY (product_vendor_id) REFERENCES 
> vendor(vendor_id) 
>   ); 
> CREATE TABLE vendor 
>   ( 
>  vendor_id INTEGER, 
>  PRIMARY KEY (vendor_id) 
>   ); 
> {code}
> In the above syntax, [CONSTRAINT constraint-Name] is optional. If it is not 
> specified by the user, we will use a system-generated constraint name. For 
> simplicity, we will allow the CONSTRAINT option for foreign keys but not for 
> the primary key, since there is only one primary key per table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13290) Support primary keys/foreign keys constraint as part of create table command in Hive

2016-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-13290:
-
Attachment: HIVE-13290.3.patch

> Support primary keys/foreign keys constraint as part of create table command 
> in Hive
> 
>
> Key: HIVE-13290
> URL: https://issues.apache.org/jira/browse/HIVE-13290
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13290.1.patch, HIVE-13290.2.patch, 
> HIVE-13290.3.patch
>
>
> Support for the following statements:
> {code}
> CREATE TABLE product 
>   ( 
>  product_id        INTEGER, 
>  product_vendor_id INTEGER, 
>  PRIMARY KEY (product_id), 
>  CONSTRAINT product_fk_1 FOREIGN KEY (product_vendor_id) REFERENCES 
> vendor(vendor_id) 
>   ); 
> CREATE TABLE vendor 
>   ( 
>  vendor_id INTEGER, 
>  PRIMARY KEY (vendor_id) 
>   ); 
> {code}
> In the above syntax, [CONSTRAINT constraint-Name] is optional. If it is not 
> specified by the user, we will use a system-generated constraint name. For 
> simplicity, we will allow the CONSTRAINT option for foreign keys but not for 
> the primary key, since there is only one primary key per table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-03-28 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12049:
---
Attachment: (was: new-driver-profiles.png)

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, HIVE-12049.9.patch
>
>
> For each fetch request to HiveServer2, we pay the penalty of deserializing 
> the row objects and translating them into a different representation suitable 
> for the RPC transfer. In moderate to high concurrency scenarios, this can 
> result in significant CPU and memory wastage. By having each task write the 
> appropriate thrift objects to the output files, HiveServer2 can simply stream 
> a batch of rows on the wire without incurring any of the additional cost of 
> deserialization and translation. 
> This can be implemented by writing a new SerDe, which the FileSinkOperator 
> can use to write thrift formatted row batches to the output file. Using the 
> pluggable property of the {{hive.query.result.fileformat}}, we can set it to 
> use SequenceFile and write a batch of thrift formatted rows as a value blob. 
> The FetchTask can now simply read the blob and send it over the wire. On the 
> client side, the *DBC driver can read the blob and since it is already 
> formatted in the way it expects, it can continue building the ResultSet the 
> way it does in the current implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-03-28 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214598#comment-15214598
 ] 

Gopal V commented on HIVE-12049:


Profiling the patch, most of the CPU in fetchResults is now spent in Session 
acquire and release.

!new-driver-profiles.png!

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, HIVE-12049.9.patch
>
>
> For each fetch request to HiveServer2, we pay the penalty of deserializing 
> the row objects and translating them into a different representation suitable 
> for the RPC transfer. In moderate to high concurrency scenarios, this can 
> result in significant CPU and memory wastage. By having each task write the 
> appropriate thrift objects to the output files, HiveServer2 can simply stream 
> a batch of rows on the wire without incurring any of the additional cost of 
> deserialization and translation. 
> This can be implemented by writing a new SerDe, which the FileSinkOperator 
> can use to write thrift formatted row batches to the output file. Using the 
> pluggable property of the {{hive.query.result.fileformat}}, we can set it to 
> use SequenceFile and write a batch of thrift formatted rows as a value blob. 
> The FetchTask can now simply read the blob and send it over the wire. On the 
> client side, the *DBC driver can read the blob and since it is already 
> formatted in the way it expects, it can continue building the ResultSet the 
> way it does in the current implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-03-28 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12049:
---
Attachment: new-driver-profiles.png

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, 
> HIVE-12049.9.patch, new-driver-profiles.png
>
>
> For each fetch request to HiveServer2, we pay the penalty of deserializing 
> the row objects and translating them into a different representation suitable 
> for the RPC transfer. In moderate to high concurrency scenarios, this can 
> result in significant CPU and memory wastage. By having each task write the 
> appropriate thrift objects to the output files, HiveServer2 can simply stream 
> a batch of rows on the wire without incurring any of the additional cost of 
> deserialization and translation. 
> This can be implemented by writing a new SerDe, which the FileSinkOperator 
> can use to write thrift formatted row batches to the output file. Using the 
> pluggable property of the {{hive.query.result.fileformat}}, we can set it to 
> use SequenceFile and write a batch of thrift formatted rows as a value blob. 
> The FetchTask can now simply read the blob and send it over the wire. On the 
> client side, the *DBC driver can read the blob and since it is already 
> formatted in the way it expects, it can continue building the ResultSet the 
> way it does in the current implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-03-28 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214528#comment-15214528
 ] 

Thejas M Nair commented on HIVE-12049:
--

The "not in list of params that are allowed to be modified at runtime" is 
happening because SQL std auth or Ranger is enabled, and it allows modifying 
configs only in a whitelist.
[~gopalv] A workaround is to add the parameter as value of 
hive.security.authorization.sqlstd.confwhitelist.append in HS2.

[~rohitdholakia] We should add hive.server2.thrift.resulset.serialize.in.tasks 
parameter to the default whiltelist. It should be added to   
sqlStdAuthSafeVarNames  array in HiveConf.java.
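
As a rough illustration of the workaround (not the eventual HiveConf.java change), the whitelist-append property can be set server-side. The property names below are taken verbatim from the comment, and treating the appended value as a regex is an assumption here:

{code:java}
import org.apache.hadoop.hive.conf.HiveConf;

public class WhitelistAppendSketch {
  public static void main(String[] args) {
    HiveConf conf = new HiveConf();
    // Server-side workaround: append the parameter (escaped as a regex) so
    // SQL std auth / Ranger allows clients to SET it at runtime. The proper
    // fix is to add the name to sqlStdAuthSafeVarNames in HiveConf.java.
    conf.set("hive.security.authorization.sqlstd.confwhitelist.append",
        "hive\\.server2\\.thrift\\.resulset\\.serialize\\.in\\.tasks");
    System.out.println(
        conf.get("hive.security.authorization.sqlstd.confwhitelist.append"));
  }
}
{code}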


> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, HIVE-12049.9.patch
>
>
> For each fetch request to HiveServer2, we pay the penalty of deserializing 
> the row objects and translating them into a different representation suitable 
> for the RPC transfer. In moderate to high concurrency scenarios, this can 
> result in significant CPU and memory wastage. By having each task write the 
> appropriate thrift objects to the output files, HiveServer2 can simply stream 
> a batch of rows on the wire without incurring any of the additional cost of 
> deserialization and translation. 
> This can be implemented by writing a new SerDe, which the FileSinkOperator 
> can use to write thrift formatted row batches to the output file. Using the 
> pluggable property of the {{hive.query.result.fileformat}}, we can set it to 
> use SequenceFile and write a batch of thrift formatted rows as a value blob. 
> The FetchTask can now simply read the blob and send it over the wire. On the 
> client side, the *DBC driver can read the blob and since it is already 
> formatted in the way it expects, it can continue building the ResultSet the 
> way it does in the current implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-03-28 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214528#comment-15214528
 ] 

Thejas M Nair edited comment on HIVE-12049 at 3/28/16 5:41 PM:
---

The "not in list of params that are allowed to be modified at runtime" is 
happening because SQL std auth or Ranger is enabled, and it allows modifying 
configs only in a whitelist.
[~gopalv] A workaround is to add the parameter as value of 
hive.security.authorization.sqlstd.confwhitelist.append in HS2, or disable 
authorization.

[~rohitdholakia] We should add hive.server2.thrift.resulset.serialize.in.tasks 
parameter to the default whiltelist. It should be added to   
sqlStdAuthSafeVarNames  array in HiveConf.java.



was (Author: thejas):
The "not in list of params that are allowed to be modified at runtime" is 
happening because SQL std auth or Ranger is enabled, and it allows modifying 
configs only in a whitelist.
[~gopalv] A workaround is to add the parameter as value of 
hive.security.authorization.sqlstd.confwhitelist.append in HS2.

[~rohitdholakia] We should add hive.server2.thrift.resulset.serialize.in.tasks 
parameter to the default whiltelist. It should be added to   
sqlStdAuthSafeVarNames  array in HiveConf.java.


> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, HIVE-12049.9.patch
>
>
> For each fetch request to HiveServer2, we pay the penalty of deserializing 
> the row objects and translating them into a different representation suitable 
> for the RPC transfer. In moderate to high concurrency scenarios, this can 
> result in significant CPU and memory wastage. By having each task write the 
> appropriate thrift objects to the output files, HiveServer2 can simply stream 
> a batch of rows on the wire without incurring any of the additional cost of 
> deserialization and translation. 
> This can be implemented by writing a new SerDe, which the FileSinkOperator 
> can use to write thrift formatted row batches to the output file. Using the 
> pluggable property of the {{hive.query.result.fileformat}}, we can set it to 
> use SequenceFile and write a batch of thrift formatted rows as a value blob. 
> The FetchTask can now simply read the blob and send it over the wire. On the 
> client side, the *DBC driver can read the blob and since it is already 
> formatted in the way it expects, it can continue building the ResultSet the 
> way it does in the current implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13363) Add hive.metastore.token.signature property to HiveConf

2016-03-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214459#comment-15214459
 ] 

Hive QA commented on HIVE-13363:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12795494/HIVE-13363.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7397/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7397/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7397/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[WARNING]   - 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim
[WARNING]   - 64 more...
[WARNING] hive-exec-2.1.0-SNAPSHOT.jar, snappy-0.2.jar define 16 overlappping 
classes: 
[WARNING]   - org.iq80.snappy.SnappyCompressor
[WARNING]   - org.iq80.snappy.SlowMemory
[WARNING]   - org.iq80.snappy.Crc32C
[WARNING]   - org.iq80.snappy.CorruptionException
[WARNING]   - org.iq80.snappy.UnsafeMemory
[WARNING]   - org.iq80.snappy.Memory
[WARNING]   - org.iq80.snappy.Snappy
[WARNING]   - org.iq80.snappy.Main
[WARNING]   - org.iq80.snappy.HadoopSnappyCodec
[WARNING]   - org.iq80.snappy.SnappyDecompressor
[WARNING]   - 6 more...
[WARNING] hive-serde-2.1.0-SNAPSHOT.jar, hive-exec-2.1.0-SNAPSHOT.jar define 
588 overlappping classes: 
[WARNING]   - 
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector
[WARNING]   - org.apache.hadoop.hive.serde2.lazy.LazySerDeParameters
[WARNING]   - org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDeXception
[WARNING]   - org.apache.hadoop.hive.serde2.proto.test.Complexpb$Complex
[WARNING]   - org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDeStructBase
[WARNING]   - 
org.apache.hadoop.hive.serde.test.ThriftTestObj$ThriftTestObjTupleSchemeFactory
[WARNING]   - org.apache.hadoop.hive.serde2.thrift.test.Complex$1
[WARNING]   - org.apache.hadoop.hive.serde2.thrift.test.MiniStruct
[WARNING]   - 
org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector
[WARNING]   - 
org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$1
[WARNING]   - 578 more...
[WARNING] jackson-mapper-asl-1.9.2.jar, hive-exec-2.1.0-SNAPSHOT.jar define 494 
overlappping classes: 
[WARNING]   - org.codehaus.jackson.map.ser.impl.SerializerCache$TypeKey
[WARNING]   - org.codehaus.jackson.map.DeserializerProvider
[WARNING]   - org.codehaus.jackson.map.deser.std.StdKeyDeserializer$LongKD
[WARNING]   - org.codehaus.jackson.node.ValueNode
[WARNING]   - org.codehaus.jackson.map.ser.std.CollectionSerializer
[WARNING]   - org.codehaus.jackson.map.ser.impl.PropertySerializerMap$Double
[WARNING]   - org.codehaus.jackson.map.deser.FromStringDeserializer
[WARNING]   - org.codehaus.jackson.map.deser.std.StdKeyDeserializer$FloatKD
[WARNING]   - org.codehaus.jackson.map.Deserializers
[WARNING]   - org.codehaus.jackson.map.ser.StdSerializers$SerializableSerializer
[WARNING]   - 484 more...
[WARNING] hadoop-yarn-common-2.6.0.jar, hadoop-yarn-api-2.6.0.jar define 3 
overlappping classes: 
[WARNING]   - org.apache.hadoop.yarn.factories.package-info
[WARNING]   - org.apache.hadoop.yarn.util.package-info
[WARNING]   - org.apache.hadoop.yarn.factory.providers.package-info
[WARNING] commons-beanutils-core-1.8.0.jar, commons-beanutils-1.7.0.jar, 
commons-collections-3.2.2.jar define 10 overlappping classes: 
[WARNING]   - org.apache.commons.collections.FastHashMap$EntrySet
[WARNING]   - org.apache.commons.collections.ArrayStack
[WARNING]   - org.apache.commons.collections.FastHashMap$1
[WARNING]   - org.apache.commons.collections.FastHashMap$KeySet
[WARNING]   - org.apache.commons.collections.FastHashMap$CollectionView
[WARNING]   - org.apache.commons.collections.BufferUnderflowException
[WARNING]   - org.apache.commons.collections.Buffer
[WARNING]   - 
org.apache.commons.collections.FastHashMap$CollectionView$CollectionViewIterator
[WARNING]   - org.apache.commons.collections.FastHashMap$Values
[WARNING]   - org.apache.commons.collections.FastHashMap
[WARNING] hive-shims-0.23-2.1.0-SNAPSHOT.jar, hive-exec-2.1.0-SNAPSHOT.jar 
define 29 overlappping classes: 
[WARNING]   - 
org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsFileStatusWithIdImpl
[WARNING]   - org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim
[WARNING]   - org.apache.hadoop.hive.shims.Jetty23Shims$Server
[WARNING]   - org.apache.hadoop.mapred.WebHCatJTShim23
[WARNING]   - org.apache.hadoop.hive.shims.Hadoop23Shims$MiniTezShim
[WARNING]   - org.apache.hadoop.hive.shims.Jetty23Shims$1
[WARNING]   - org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge23
[WARNING]   - org.apache.hadoop.hive.shims.Jetty23Shims
[WARNING]   - 

[jira] [Commented] (HIVE-12612) beeline always exits with 0 status when reading query from standard input

2016-03-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214424#comment-15214424
 ] 

Hive QA commented on HIVE-12612:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12795428/HIVE-12612.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 9882 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFMonthString
org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFMonthTimestamp
org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFYearString
org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFYearTimestamp
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7396/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7396/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7396/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12795428 - PreCommit-HIVE-TRUNK-Build

> beeline always exits with 0 status when reading query from standard input
> -
>
> Key: HIVE-12612
> URL: https://issues.apache.org/jira/browse/HIVE-12612
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.1.0
> Environment: CDH5.5.0
>Reporter: Paulo Sequeira
>Assignee: Reuben Kuhnert
>Priority: Minor
> Attachments: HIVE-12612.01.patch, HIVE-12612.02.patch
>
>
> Similar to what was reported on HIVE-6978, but now it only happens when the 
> query is read from the standard input. For example, the following fails as 
> expected:
> {code}
> bash$ if beeline -u "jdbc:hive2://..." -e "boo;" ; then echo "Ok?!" ; else 
> echo "Failed!" ; fi
> Connecting to jdbc:hive2://...
> Connected to: Apache Hive (version 1.1.0-cdh5.5.0)
> Driver: Hive JDBC (version 1.1.0-cdh5.5.0)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Error: Error while compiling statement: FAILED: ParseException line 1:0 
> cannot recognize input near 'boo' '' '' (state=42000,code=4)
> Closing: 0: jdbc:hive2://...
> Failed!
> {code}
> But the following does not:
> {code}
> bash$ if echo "boo;"|beeline -u "jdbc:hive2://..." ; then echo "Ok?!" ; else 
> echo "Failed!" ; fi
> Connecting to jdbc:hive2://...
> Connected to: Apache Hive (version 1.1.0-cdh5.5.0)
> Driver: Hive JDBC (version 1.1.0-cdh5.5.0)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Beeline version 1.1.0-cdh5.5.0 by Apache Hive
> 0: jdbc:hive2://...:8> Error: Error while compiling statement: FAILED: 
> ParseException line 1:0 cannot recognize input near 'boo' '' '' 
> (state=42000,code=4)
> 0: jdbc:hive2://...:8> Closing: 0: jdbc:hive2://...
> Ok?!
> {code}
> This was misleading our batch scripts into always believing that the 
> execution of the queries succeeded, when sometimes that was not the case. 
> h2. Workaround
> We found we can work around the issue by always using the -e or the -f 
> parameters, and even reading the standard input through the /dev/stdin 
> device (this was useful because a lot of the scripts fed the queries from 
> here-documents), like this:
> {code:title=some-script.sh}
> #!/bin/sh
> set -o nounset -o errexit -o pipefail
> # As beeline is failing to report an error status if reading the query
> # to be executed from STDIN, check whether no -f or -e option is used
> # and, in that case, pretend it has to read the query from a regular
> # file using -f to read from /dev/stdin
> function beeline_workaround_exit_status () {
> for arg in "$@"
> do if [ "$arg" = "-f" -o "$arg" = "-e" ]
>  
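
The quoted workaround script is cut off in the archive. As a toy illustration of the behavior being asked for (not Beeline's actual code), a stdin-driven dispatcher only needs to remember whether any statement failed and reflect that in its exit status:

{code:java}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class ExitStatusSketch {
  // Stand-in for executing one statement; here anything containing "boo"
  // fails, mirroring the unparseable statement in the example above.
  static boolean run(String command) {
    return !command.contains("boo");
  }

  public static void main(String[] args) throws IOException {
    BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
    boolean failed = false;
    String line;
    while ((line = in.readLine()) != null) {
      if (!line.trim().isEmpty()) {
        failed |= !run(line);
      }
    }
    System.exit(failed ? 1 : 0);  // non-zero when any statement failed
  }
}
{code}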

[jira] [Updated] (HIVE-13367) Extending HPLSQL parser

2016-03-28 Thread Dmitry Tolpeko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Tolpeko updated HIVE-13367:
--
Status: Patch Available  (was: Open)

> Extending HPLSQL parser
> ---
>
> Key: HIVE-13367
> URL: https://issues.apache.org/jira/browse/HIVE-13367
> Project: Hive
>  Issue Type: Improvement
>  Components: hpl/sql
>Affects Versions: 2.1.0
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
> Attachments: HIVE-13367.1.patch
>
>
> Extending HPL/SQL parser to support more procedural constructs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13367) Extending HPLSQL parser

2016-03-28 Thread Dmitry Tolpeko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Tolpeko updated HIVE-13367:
--
Attachment: HIVE-13367.1.patch

Patch with tests attached. 

> Extending HPLSQL parser
> ---
>
> Key: HIVE-13367
> URL: https://issues.apache.org/jira/browse/HIVE-13367
> Project: Hive
>  Issue Type: Improvement
>  Components: hpl/sql
>Affects Versions: 2.1.0
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
> Attachments: HIVE-13367.1.patch
>
>
> Extending HPL/SQL parser to support more procedural constructs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12968) genNotNullFilterForJoinSourcePlan: needs to merge predicates into the multi-AND

2016-03-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214188#comment-15214188
 ] 

Hive QA commented on HIVE-12968:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12795426/HIVE-12968.3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7395/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7395/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7395/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-7395/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 7747458 HIVE-13358: Stats state is not captured correctly: turn 
off stats optimizer for sampled table (Pengcheng Xiong, reviewed by Ashutosh 
Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 7747458 HIVE-13358: Stats state is not captured correctly: turn 
off stats optimizer for sampled table (Pengcheng Xiong, reviewed by Ashutosh 
Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12795426 - PreCommit-HIVE-TRUNK-Build

> genNotNullFilterForJoinSourcePlan: needs to merge predicates into the 
> multi-AND
> ---
>
> Key: HIVE-12968
> URL: https://issues.apache.org/jira/browse/HIVE-12968
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Minor
> Attachments: HIVE-12968.1.patch, HIVE-12968.2.patch, 
> HIVE-12968.3.patch
>
>
> {code}
> predicate: ((cbigint is not null and cint is not null) and cint BETWEEN 
> 100 AND 300) (type: boolean)
> {code}
> does not fold the IS_NULL on cint, because of the structure of the AND clause.
> For example, see {{tez_dynpart_hashjoin_1.q}}
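
To illustrate the merge the title asks for, here is a self-contained toy sketch (plain Java, not Hive's ExprNodeDesc machinery) that flattens nested ANDs into one multi-child AND, so a later rule sees the {{cint is not null}} conjunct next to the BETWEEN and can fold it:

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class Expr {
  final String op;           // "AND" for conjunctions, otherwise a leaf predicate
  final List<Expr> children;

  Expr(String op, Expr... children) {
    this.op = op;
    this.children = Arrays.asList(children);
  }

  // Collect every conjunct of (possibly nested) ANDs into one flat list.
  static List<Expr> flattenAnd(Expr e, List<Expr> out) {
    if ("AND".equals(e.op)) {
      for (Expr child : e.children) {
        flattenAnd(child, out);
      }
    } else {
      out.add(e);
    }
    return out;
  }

  public static void main(String[] args) {
    Expr pred = new Expr("AND",
        new Expr("AND",
            new Expr("cbigint is not null"),
            new Expr("cint is not null")),
        new Expr("cint BETWEEN 100 AND 300"));
    // Prints the three conjuncts side by side, as one multi-AND.
    for (Expr leaf : flattenAnd(pred, new ArrayList<Expr>())) {
      System.out.println(leaf.op);
    }
  }
}
{code}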



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11601) confusing message in start/stop webhcat server

2016-03-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214185#comment-15214185
 ] 

Hive QA commented on HIVE-11601:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12795372/HIVE-11601.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7394/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7394/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7394/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-7394/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 7747458 HIVE-13358: Stats state is not captured correctly: turn 
off stats optimizer for sampled table (Pengcheng Xiong, reviewed by Ashutosh 
Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 7747458 HIVE-13358: Stats state is not captured correctly: turn 
off stats optimizer for sampled table (Pengcheng Xiong, reviewed by Ashutosh 
Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12795372 - PreCommit-HIVE-TRUNK-Build

> confusing message in start/stop webhcat server
> --
>
> Key: HIVE-11601
> URL: https://issues.apache.org/jira/browse/HIVE-11601
> Project: Hive
>  Issue Type: Improvement
>  Components: WebHCat
>Affects Versions: 0.13.0
>Reporter: Takashi Ohnishi
>Assignee: Andrew Sears
>Priority: Trivial
> Attachments: HIVE-11601.patch, HIVE-11601.patch
>
>
> HIVE-5167 made it possible for webhcat_config.sh to output the message below:
> {code}
> Lenght of string is non zero
> {code}
> This may be a misspelling,
> and I think it is not easy to understand what it is trying to say.
> How about changing it to
> {code}
> found HIVE_HOME is already set.
> {code}
> or remove this message?
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13326) HiveServer2: Make ZK config publishing configurable

2016-03-28 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-13326:

Attachment: HIVE-13326.2.patch

> HiveServer2: Make ZK config publishing configurable
> ---
>
> Key: HIVE-13326
> URL: https://issues.apache.org/jira/browse/HIVE-13326
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, JDBC
>Affects Versions: 2.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-13326.1.patch, HIVE-13326.2.patch
>
>
> We should revert to older behaviour when config publishing is disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13149) Remove some unnecessary HMS connections from HS2

2016-03-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214055#comment-15214055
 ] 

Hive QA commented on HIVE-13149:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12795360/HIVE-13149.6.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 9825 tests 
executed
*Failed tests:*
{noformat}
TestCliDriver-cbo_rp_stats.q-skewjoinopt16.q-rename_column.q-and-12-more - did 
not produce a TEST-*.xml file
TestJdbcWithMiniHS2 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-dynpart_sort_optimization2.q-cte_mat_1.q-tez_bmj_schema_evolution.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFMonthString
org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFMonthTimestamp
org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFYearString
org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFYearTimestamp
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7393/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7393/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7393/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12795360 - PreCommit-HIVE-TRUNK-Build

> Remove some unnecessary HMS connections from HS2 
> -
>
> Key: HIVE-13149
> URL: https://issues.apache.org/jira/browse/HIVE-13149
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, 
> HIVE-13149.3.patch, HIVE-13149.4.patch, HIVE-13149.5.patch, HIVE-13149.6.patch
>
>
> In the SessionState class, we currently always try to get an HMS connection 
> in {{start(SessionState startSs, boolean isAsync, LogHelper console)}}, 
> regardless of whether the connection will be used later or not. 
> When SessionState is accessed by the tasks in TaskRunner.java, a new HMS 
> connection is established for each Task thread, even though most of the 
> tasks, other than some like StatsTask, don't need to access HMS. If 
> HiveServer2 is configured to run in parallel and the query involves many 
> tasks, the connections are created but left unused.
> {noformat}
>   @Override
>   public void run() {
> runner = Thread.currentThread();
> try {
>   OperationLog.setCurrentOperationLog(operationLog);
>   SessionState.start(ss);
>   runSequential();
> {noformat}
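
One way to read the description is that the connection should be opened lazily, on first real use, rather than eagerly in SessionState.start(). A minimal sketch of that idea follows; MetastoreClient is a hypothetical stand-in for Hive's IMetaStoreClient, and this is not the actual patch:

{code:java}
// Hypothetical stand-in for Hive's IMetaStoreClient.
interface MetastoreClient extends AutoCloseable {
  void close();
}

public class LazyMetastoreHolder {
  public interface ClientFactory {
    MetastoreClient open() throws Exception;  // actually connects to HMS
  }

  private final ClientFactory factory;
  private volatile MetastoreClient client;

  public LazyMetastoreHolder(ClientFactory factory) {
    this.factory = factory;
  }

  // Double-checked locking: at most one connection is opened, and only when
  // a task actually needs the metastore, instead of once per task thread.
  public MetastoreClient get() throws Exception {
    MetastoreClient c = client;
    if (c == null) {
      synchronized (this) {
        c = client;
        if (c == null) {
          client = c = factory.open();
        }
      }
    }
    return c;
  }
}
{code}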



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-3432) perform a map-only group by if grouping key matches the sorting properties of the table

2016-03-28 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-3432:
-
Labels:   (was: TODOC10)

> perform a map-only group by if grouping key matches the sorting properties of 
> the table
> ---
>
> Key: HIVE-3432
> URL: https://issues.apache.org/jira/browse/HIVE-3432
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Fix For: 0.10.0
>
> Attachments: hive.3432.1.patch, hive.3432.2.patch, hive.3432.3.patch, 
> hive.3432.4.patch, hive.3432.5.patch, hive.3432.6.patch, hive.3432.7.patch, 
> hive.3432.8.patch
>
>
> There should be an option to use BucketizedHiveInputFormat and perform a 
> map-only group by; there would then be no need to perform a map-side 
> aggregation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-4281) add hive.map.groupby.sorted.testmode

2016-03-28 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213991#comment-15213991
 ] 

Lefty Leverenz commented on HIVE-4281:
--

Doc note:  Removing the TODOC11 label because 
*hive.map.groupby.sorted.testmode* is now documented in the wiki (including its 
removal by HIVE-12325):

* [Configuration Properties -- hive.map.groupby.sorted.testmode | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.map.groupby.sorted.testmode]


> add hive.map.groupby.sorted.testmode
> 
>
> Key: HIVE-4281
> URL: https://issues.apache.org/jira/browse/HIVE-4281
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Fix For: 0.11.0
>
> Attachments: hive.4281.1.patch, hive.4281.2.patch, 
> hive.4281.2.patch-nohcat, hive.4281.3.patch
>
>
> The idea behind this would be to test hive.map.groupby.sorted.
> Since this is a new feature, it might be a good idea to run it in test mode,
> where a query property would indicate that the query plan would have changed.
> If a customer wants, they can run those queries offline, compare the results
> for correctness, and set hive.map.groupby.sorted only if all the results are
> the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-4281) add hive.map.groupby.sorted.testmode

2016-03-28 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-4281:
-
Labels:   (was: TODOC11)

> add hive.map.groupby.sorted.testmode
> 
>
> Key: HIVE-4281
> URL: https://issues.apache.org/jira/browse/HIVE-4281
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Fix For: 0.11.0
>
> Attachments: hive.4281.1.patch, hive.4281.2.patch, 
> hive.4281.2.patch-nohcat, hive.4281.3.patch
>
>
> The idea behind this would be to test hive.map.groupby.sorted.
> Since this is a new feature, it might be a good idea to run it in test mode,
> where a query property would indicate that the query plan would have changed.
> If a customer wants, they can run those queries offline, compare the results
> for correctness, and set hive.map.groupby.sorted only if all the results are
> the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12325) Turn hive.map.groupby.sorted on by default

2016-03-28 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-12325:
--
Labels:   (was: TODOC2.0)

> Turn hive.map.groupby.sorted on by default
> --
>
> Key: HIVE-12325
> URL: https://issues.apache.org/jira/browse/HIVE-12325
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Reporter: Ashutosh Chauhan
>Assignee: Chetna Chaudhari
> Fix For: 2.0.0
>
> Attachments: HIVE-12325.1.patch
>
>
> When applicable, it can avoid the shuffle phase altogether for a group by, 
> which is a performance win.
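
One way to observe the effect, sketched under the assumption of a table 
bucketed and sorted on the grouping key (*sorted_bucketed_table* is 
hypothetical):

{noformat}
-- On by default from Hive 2.0.0; set explicitly on older releases.
SET hive.map.groupby.sorted=true;

-- For a group by on the sort/bucket key, the plan should be map-only:
-- no reduce stage, i.e. the shuffle is avoided entirely.
EXPLAIN SELECT key, count(*) FROM sorted_bucketed_table GROUP BY key;
{noformat}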



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12325) Turn hive.map.groupby.sorted on by default

2016-03-28 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213988#comment-15213988
 ] 

Lefty Leverenz commented on HIVE-12325:
---

Removing the TODOC2.0 label because *hive.map.groupby.sorted* and the removal 
of *hive.map.groupby.sorted.testmode* are now documented in the wiki:

* [Configuration Properties -- hive.map.groupby.sorted | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.map.groupby.sorted]
* [Configuration Properties -- hive.map.groupby.sorted.testmode | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.map.groupby.sorted.testmode]


> Turn hive.map.groupby.sorted on by default
> --
>
> Key: HIVE-12325
> URL: https://issues.apache.org/jira/browse/HIVE-12325
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Reporter: Ashutosh Chauhan
>Assignee: Chetna Chaudhari
> Fix For: 2.0.0
>
> Attachments: HIVE-12325.1.patch
>
>
> When applicable, it can avoid the shuffle phase altogether for a group by, 
> which is a performance win.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-4240) optimize hive.enforce.bucketing and hive.enforce.sorting insert

2016-03-28 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213949#comment-15213949
 ] 

Lefty Leverenz commented on HIVE-4240:
--

Removed the TODOC11 label because *hive.optimize.bucketingsorting* is now 
documented in the wiki:

* [Configuration Properties -- hive.optimize.bucketingsorting | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.optimize.bucketingsorting]

> optimize hive.enforce.bucketing and hive.enforce.sorting insert
> ---
>
> Key: HIVE-4240
> URL: https://issues.apache.org/jira/browse/HIVE-4240
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Fix For: 0.11.0
>
> Attachments: hive.4240.1.patch, hive.4240.2.patch, hive.4240.3.patch, 
> hive.4240.4.patch, hive.4240.5.patch, hive.4240.5.patch-nohcat
>
>
> Consider the following scenario:
> set hive.optimize.bucketmapjoin = true;
> set hive.optimize.bucketmapjoin.sortedmerge = true;
> set hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
> set hive.enforce.bucketing=true;
> set hive.enforce.sorting=true;
> set hive.exec.reducers.max = 1;
> set hive.merge.mapfiles=false;
> set hive.merge.mapredfiles=false;
> -- Create two bucketed and sorted tables
> CREATE TABLE test_table1 (key INT, value STRING) PARTITIONED BY (ds STRING)
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;
> CREATE TABLE test_table2 (key INT, value STRING) PARTITIONED BY (ds STRING)
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;
> FROM src
> INSERT OVERWRITE TABLE test_table1 PARTITION (ds = '1') SELECT *;
> -- Insert data into the bucketed table by selecting from another bucketed table
> -- This should be a map-only operation
> INSERT OVERWRITE TABLE test_table2 PARTITION (ds = '1')
> SELECT a.key, a.value FROM test_table1 a WHERE a.ds = '1';
> We should not need a reducer to perform the above operation.
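
A verification sketch for the scenario above, assuming the quoted tables exist 
and are populated (the optimization added here is controlled by 
*hive.optimize.bucketingsorting*, documented in the comment above):

{noformat}
SET hive.optimize.bucketingsorting=true;

-- Source and target share bucketing/sort columns (key) and bucket count, so
-- with the optimization on this insert should compile to a map-only plan:
-- the EXPLAIN output should contain no reduce stage.
EXPLAIN
INSERT OVERWRITE TABLE test_table2 PARTITION (ds = '1')
SELECT a.key, a.value FROM test_table1 a WHERE a.ds = '1';
{noformat}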



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12331) Remove hive.enforce.bucketing & hive.enforce.sorting configs

2016-03-28 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213948#comment-15213948
 ] 

Lefty Leverenz commented on HIVE-12331:
---

Configuration Properties now includes *hive.optimize.bucketingsorting*, but the 
TODOC2.0 label remains because other wiki docs still need to be updated.

> Remove hive.enforce.bucketing & hive.enforce.sorting configs
> 
>
> Key: HIVE-12331
> URL: https://issues.apache.org/jira/browse/HIVE-12331
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12331.1.patch, HIVE-12331.patch
>
>
> If a table is created as bucketed and/or sorted and these configs are set to 
> false, data will be inserted into the wrong buckets and/or the wrong sort 
> order; if those tables are then used in a bucket map join (BMJ) or 
> sort-merge bucket join (SMBJ), queries will return wrong results.
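
A before/after sketch, assuming *bucketed_sorted_table* is a hypothetical 
table declared CLUSTERED BY and SORTED BY:

{noformat}
-- Hive < 2.0: enforcement had to be requested explicitly; leaving these at
-- false silently wrote data with the wrong bucketing and sort order.
SET hive.enforce.bucketing=true;
SET hive.enforce.sorting=true;
INSERT OVERWRITE TABLE bucketed_sorted_table SELECT key, value FROM src;

-- Hive 2.0+: both configs are removed and enforcement is always on, so
-- bucket map joins and sort-merge bucket joins can trust the table layout.
{noformat}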



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-4240) optimize hive.enforce.bucketing and hive.enforce.sorting insert

2016-03-28 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-4240:
-
Labels:   (was: TODOC11)

> optimize hive.enforce.bucketing and hive.enforce.sorting insert
> ---
>
> Key: HIVE-4240
> URL: https://issues.apache.org/jira/browse/HIVE-4240
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Fix For: 0.11.0
>
> Attachments: hive.4240.1.patch, hive.4240.2.patch, hive.4240.3.patch, 
> hive.4240.4.patch, hive.4240.5.patch, hive.4240.5.patch-nohcat
>
>
> Consider the following scenario:
> set hive.optimize.bucketmapjoin = true;
> set hive.optimize.bucketmapjoin.sortedmerge = true;
> set hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
> set hive.enforce.bucketing=true;
> set hive.enforce.sorting=true;
> set hive.exec.reducers.max = 1;
> set hive.merge.mapfiles=false;
> set hive.merge.mapredfiles=false;
> -- Create two bucketed and sorted tables
> CREATE TABLE test_table1 (key INT, value STRING) PARTITIONED BY (ds STRING)
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;
> CREATE TABLE test_table2 (key INT, value STRING) PARTITIONED BY (ds STRING)
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;
> FROM src
> INSERT OVERWRITE TABLE test_table1 PARTITION (ds = '1') SELECT *;
> -- Insert data into the bucketed table by selecting from another bucketed table
> -- This should be a map-only operation
> INSERT OVERWRITE TABLE test_table2 PARTITION (ds = '1')
> SELECT a.key, a.value FROM test_table1 a WHERE a.ds = '1';
> We should not need a reducer to perform the above operation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

