[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization

2018-09-28 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632488#comment-16632488
 ] 

Lefty Leverenz commented on HIVE-11394:
---

Thanks for the doc, [~asears], here's a link:

* [Explain -- The VECTORIZATION Clause | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain#LanguageManualExplain-TheVECTORIZATIONClause]

I removed the TODOC2.2 label.

> Enhance EXPLAIN display for vectorization
> -
>
> Key: HIVE-11394
> URL: https://issues.apache.org/jira/browse/HIVE-11394
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.3.0
>
> Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, 
> HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, 
> HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, 
> HIVE-11394.09.patch, HIVE-11394.091.patch, HIVE-11394.092.patch, 
> HIVE-11394.093.patch, HIVE-11394.094.patch, HIVE-11394.095.patch, 
> HIVE-11394.096.patch, HIVE-11394.097.patch, HIVE-11394.098.patch, 
> HIVE-11394.099.patch, HIVE-11394.0991.patch, HIVE-11394.0992.patch
>
>
> Add detail to the EXPLAIN output showing why Map or Reduce work is not 
> vectorized.
> New syntax is: EXPLAIN VECTORIZATION \[ONLY\] 
> \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\]
> The ONLY option suppresses most non-vectorization elements.
> SUMMARY shows vectorization information for the PLAN (is vectorization 
> enabled) and a summary of Map and Reduce work.
> OPERATOR shows vectorization information for operators.  E.g. Filter 
> Vectorization.  It includes all information of SUMMARY, too.
> EXPRESSION shows vectorization information for expressions.  E.g. 
> predicateExpression.  It includes all information of SUMMARY and OPERATOR, 
> too.
> DETAIL shows the most detailed vectorization information.
> It includes all information of SUMMARY, OPERATOR, and EXPRESSION, too.
> The optional clauses default to no ONLY (non-vectorization elements are 
> shown) and to SUMMARY.
> ---
> Here are some examples:
> EXPLAIN VECTORIZATION example:
> (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization 
> sections)
> Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION 
> SUMMARY.
> Under Reducer 3’s "Reduce Vectorization:" you’ll see
> notVectorizedReason: Aggregation Function UDF avg parameter expression for 
> GROUPBY operator: Data type struct of 
> Column\[VALUE._col2\] not supported
> For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": 
> "false", which means the node has a GROUP BY with an AVG or some other 
> aggregator that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream 
> operators are row-mode, i.e. not vector output.
> If "usesVectorUDFAdaptor:" were "true" instead of "false", it would mean that 
> at least one vectorized expression is using VectorUDFAdaptor.
> And "allNative:", shown here as "false", will be "true" when all operators are 
> native.  Today, GROUP BY and FILE SINK are not native.  MAP JOIN and REDUCE 
> SINK are conditionally native.  FILTER and SELECT are native.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
> ...
>       Edges:
>         Reducer 2 <- Map 1 (SIMPLE_EDGE)
>         Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> ...
>       Vertices:
>         Map 1
>             Map Operator Tree:
>                 TableScan
>                   alias: alltypesorc
>                   Statistics: Num rows: 12288 Data size: 36696 Basic stats: COMPLETE Column stats: COMPLETE
>                   Select Operator
>                     expressions: cint (type: int)
>                     outputColumnNames: cint
>                     Statistics: Num rows: 12288 Data size: 36696 Basic stats: COMPLETE Column stats: COMPLETE
>                     Group By Operator
>                       keys: cint (type: int)
>                       mode: hash
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 5775 Data size: 17248 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 5775 Data size: 17248 Basic stats: COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized, llap
>             LLAP IO: all inputs
>             Map Vectorization:
> 
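
The EXPLAIN VECTORIZATION syntax quoted above can be sketched as a small statement builder. This is only an illustration in Python: build_explain_vectorization and included_levels are hypothetical helper names, not part of Hive, and the sample query is made up (it reuses the alltypesorc table and cint column from the plan above). The sketch assumes that ONLY precedes the level keyword, as in the quoted syntax, and that leaving out both keywords gives the SUMMARY default.

```python
# Sketch only: these helpers are hypothetical and not part of Hive.
LEVELS = ("SUMMARY", "OPERATOR", "EXPRESSION", "DETAIL")


def build_explain_vectorization(query, only=False, level=None):
    """Compose an EXPLAIN VECTORIZATION [ONLY] [level] statement for a query."""
    parts = ["EXPLAIN VECTORIZATION"]
    if only:
        parts.append("ONLY")
    if level is not None:
        level = level.upper()
        if level not in LEVELS:
            raise ValueError("unknown level: %s" % level)
        parts.append(level)
    # Drop a trailing semicolon so exactly one is emitted at the end.
    parts.append(query.strip().rstrip(";"))
    return " ".join(parts) + ";"


def included_levels(level="SUMMARY"):
    """Each level also includes the information of every level before it."""
    return LEVELS[: LEVELS.index(level.upper()) + 1]


# Made-up sample query against the table shown in the plan.
sample = "SELECT cint FROM alltypesorc GROUP BY cint"
print(build_explain_vectorization(sample))
print(build_explain_vectorization(sample, only=True, level="detail"))
print(included_levels("expression"))
```

Keeping the levels in one ordered tuple mirrors the description, where each level is a superset of the previous one.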

[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization

2018-09-28 Thread Lefty Leverenz (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-11394:
--
Labels:   (was: TODOC2.2)

> Enhance EXPLAIN display for vectorization
> -
>
> Key: HIVE-11394
> URL: https://issues.apache.org/jira/browse/HIVE-11394
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.3.0
>
> Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, 
> HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, 
> HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, 
> HIVE-11394.09.patch, HIVE-11394.091.patch, HIVE-11394.092.patch, 
> HIVE-11394.093.patch, HIVE-11394.094.patch, HIVE-11394.095.patch, 
> HIVE-11394.096.patch, HIVE-11394.097.patch, HIVE-11394.098.patch, 
> HIVE-11394.099.patch, HIVE-11394.0991.patch, HIVE-11394.0992.patch
>
>
> Add detail to the EXPLAIN output showing why Map or Reduce work is not 
> vectorized.
> New syntax is: EXPLAIN VECTORIZATION \[ONLY\] 
> \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\]
> The ONLY option suppresses most non-vectorization elements.
> SUMMARY shows vectorization information for the PLAN (is vectorization 
> enabled) and a summary of Map and Reduce work.
> OPERATOR shows vectorization information for operators.  E.g. Filter 
> Vectorization.  It includes all information of SUMMARY, too.
> EXPRESSION shows vectorization information for expressions.  E.g. 
> predicateExpression.  It includes all information of SUMMARY and OPERATOR, 
> too.
> DETAIL shows the most detailed vectorization information.
> It includes all information of SUMMARY, OPERATOR, and EXPRESSION, too.
> The optional clauses default to no ONLY (non-vectorization elements are 
> shown) and to SUMMARY.
> ---
> Here are some examples:
> EXPLAIN VECTORIZATION example:
> (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization 
> sections)
> Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION 
> SUMMARY.
> Under Reducer 3’s "Reduce Vectorization:" you’ll see
> notVectorizedReason: Aggregation Function UDF avg parameter expression for 
> GROUPBY operator: Data type struct of 
> Column\[VALUE._col2\] not supported
> For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": 
> "false", which means the node has a GROUP BY with an AVG or some other 
> aggregator that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream 
> operators are row-mode, i.e. not vector output.
> If "usesVectorUDFAdaptor:" were "true" instead of "false", it would mean that 
> at least one vectorized expression is using VectorUDFAdaptor.
> And "allNative:", shown here as "false", will be "true" when all operators are 
> native.  Today, GROUP BY and FILE SINK are not native.  MAP JOIN and REDUCE 
> SINK are conditionally native.  FILTER and SELECT are native.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
> ...
>       Edges:
>         Reducer 2 <- Map 1 (SIMPLE_EDGE)
>         Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> ...
>       Vertices:
>         Map 1
>             Map Operator Tree:
>                 TableScan
>                   alias: alltypesorc
>                   Statistics: Num rows: 12288 Data size: 36696 Basic stats: COMPLETE Column stats: COMPLETE
>                   Select Operator
>                     expressions: cint (type: int)
>                     outputColumnNames: cint
>                     Statistics: Num rows: 12288 Data size: 36696 Basic stats: COMPLETE Column stats: COMPLETE
>                     Group By Operator
>                       keys: cint (type: int)
>                       mode: hash
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 5775 Data size: 17248 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 5775 Data size: 17248 Basic stats: COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized, llap
>             LLAP IO: all inputs
>             Map Vectorization:
>                 enabled: true
>                 enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
>                 groupByVectorOutput: true
>                 inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>   

[jira] [Commented] (HIVE-20348) Hive HCat does not create a proper 'client' on kerberos cluster without hive metastore

2018-08-15 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581526#comment-16581526
 ] 

Lefty Leverenz commented on HIVE-20348:
---

[~osayankin], you named me as a reviewer on RB but I'm not qualified to review 
code.  Perhaps someone else can handle it.

https://reviews.apache.org/r/68275/

> Hive HCat does not create a proper 'client' on kerberos cluster without hive 
> metastore
> --
>
> Key: HIVE-20348
> URL: https://issues.apache.org/jira/browse/HIVE-20348
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Attachments: HIVE-20348.1.patch
>
>
> *STEPS TO REPRODUCE:*
> 1. Configure Hive to use embedded Metastore (do not specify 
> {{hive.metastore.uris}} in {{hive-site.xml}});
> 2. Create a database and a table in MySQL:
> {code:java}
> mysql -uroot -p123456 -e "CREATE DATABASE test;CREATE TABLE test.test (id INT);INSERT INTO test.test VALUES (1),(2),(3)"
> {code}
> 3. Create a table in Hive:
> {code:java}
> hive -e "CREATE TABLE default.test (id INT)"
> {code}
> 4. Run Sqoop import command:
> {code:java}
> sqoop import --connect 'jdbc:mysql://localhost:3306/test' --username root \
>   --password 123456 --table test --hcatalog-database "default" \
>   --hcatalog-table "test" --verbose -m 1
> {code}
> *ACTUAL RESULT:*
> Sqoop import command fails with an exception:
> {code:java}
> 18/08/08 01:07:09 ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hive.hcatalog.common.HCatException : 2001 : Error setting output information. Cause : java.lang.NullPointerException
>     at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:220)
>     at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:70)
>     at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureHCat(SqoopHCatUtilities.java:361)
>     at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureImportOutputFormat(SqoopHCatUtilities.java:783)
>     at org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:98)
>     at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:259)
>     at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:689)
>     at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
>     at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:498)
>     at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:606)
>     at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
>     at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
>     at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
>     at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
> Caused by: java.lang.NullPointerException
>     at org.apache.hadoop.security.token.Token.decodeWritable(Token.java:256)
>     at org.apache.hadoop.security.token.Token.decodeFromUrlString(Token.java:275)
>     at org.apache.hive.hcatalog.common.HCatUtil.extractThriftToken(HCatUtil.java:351)
>     at org.apache.hive.hcatalog.mapreduce.Security.handleSecurity(Security.java:139)
>     at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:214)
>     ... 15 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-14249) Add simple materialized views with manual rebuilds

2018-06-16 Thread Lefty Leverenz (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14249:
--
Labels: TODOC2.3  (was: TODOC2.2)

> Add simple materialized views with manual rebuilds
> --
>
> Key: HIVE-14249
> URL: https://issues.apache.org/jira/browse/HIVE-14249
> Project: Hive
>  Issue Type: New Feature
>  Components: Materialized views, Parser
>Reporter: Alan Gates
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: TODOC2.3
> Fix For: 2.3.0
>
> Attachments: HIVE-10459.2.patch, HIVE-14249.05.patch, 
> HIVE-14249.06.patch
>
>
> This patch is a start at implementing simple views. It doesn't have enough 
> testing yet (e.g. there's no negative testing). And I know it fails in the 
> partitioned case. I suspect things like security and locking don't work 
> properly yet either. But I'm posting it as a starting point.
> In this initial patch I'm just handling simple materialized views with manual 
> rebuilds. In later JIRAs we can add features such as allowing the optimizer 
> to rewrite queries to use materialized views rather than tables named in the 
> queries, giving the optimizer the ability to determine when a materialized 
> view is stale, etc.
> Also, I didn't rebase this patch against trunk after the migration from 
> svn->git so it may not apply cleanly.





[jira] [Commented] (HIVE-19741) Update documentation to reflect list of reserved words

2018-05-30 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495727#comment-16495727
 ] 

Lefty Leverenz commented on HIVE-19741:
---

The keyword APPLICATION was added by HIVE-18004 in release 3.0.0.  It's a 
reserved word (since IdentifiersParser.g doesn't list it as nonreserved).

> Update documentation to reflect list of reserved words
> --
>
> Key: HIVE-19741
> URL: https://issues.apache.org/jira/browse/HIVE-19741
> Project: Hive
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Matt Burgess
>Priority: Minor
>
> The current list of non-reserved and reserved keywords is on the Hive wiki:
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Keywords,Non-reservedKeywordsandReservedKeywords
> However it does not match the list in code (see the lexer rules here):
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g
> One particular example is the "application" keyword, which was discovered 
> while trying to create a table with a column named "application".
> This Jira proposes to align the documentation with the current set of 
> non-reserved and reserved keywords.
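
Following the reasoning in the comment above (a keyword counts as reserved unless the grammar lists it as nonreserved), the comparison between code and documentation can be sketched in Python. Everything here is an assumption for illustration: split_keywords and doc_drift are hypothetical helpers, and the three sample lists are tiny stand-ins, not the real contents of HiveLexer.g, IdentifiersParser.g, or the wiki.

```python
# Sketch only: helper names and sample data are hypothetical.
def split_keywords(all_keywords, nonreserved):
    """Return (reserved, nonreserved) as sorted lists of upper-case keywords."""
    all_kw = {k.upper() for k in all_keywords}
    non = {k.upper() for k in nonreserved} & all_kw
    # Reserved = every lexer keyword that is not listed as nonreserved.
    return sorted(all_kw - non), sorted(non)


def doc_drift(documented_reserved, actual_reserved):
    """Compare a documented reserved-word list with the one derived from code."""
    documented = {k.upper() for k in documented_reserved}
    actual = {k.upper() for k in actual_reserved}
    return {
        "missing_from_docs": sorted(actual - documented),
        "stale_in_docs": sorted(documented - actual),
    }


# Illustrative samples only, NOT the real grammar or wiki contents.
lexer_sample = ["SELECT", "TABLE", "APPLICATION", "ABORT"]
nonreserved_sample = ["ABORT"]
wiki_sample = ["SELECT", "TABLE"]

reserved, nonreserved = split_keywords(lexer_sample, nonreserved_sample)
report = doc_drift(wiki_sample, reserved)
print(report)
```

With these sample lists, APPLICATION shows up as reserved in code but missing from the documented list, which is the kind of mismatch this issue describes.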





[jira] [Commented] (HIVE-18447) JDBC: Provide a way for JDBC users to pass cookie info via connection string

2018-02-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16375988#comment-16375988
 ] 

Lefty Leverenz commented on HIVE-18447:
---

Thanks for documenting this, [~vgumashta].  Here's a link to the doc:

* [HiveServer2 Clients -- Passing Custom HTTP Cookie Key/Value Pairs via JDBC 
Driver | 
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-PassingCustomHTTPCookieKey/ValuePairsviaJDBCDriver]

(By the way, this needs a fix version.)

> JDBC: Provide a way for JDBC users to pass cookie info via connection string
> 
>
> Key: HIVE-18447
> URL: https://issues.apache.org/jira/browse/HIVE-18447
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-18447.1.patch, HIVE-18447.2.patch
>
>
> Some authentication mechanisms, like Single Sign On, need the ability to pass 
> a cookie to some intermediate authentication service like Knox via the JDBC 
> driver. We need to add the mechanism in Hive's JDBC driver (when used in HTTP 
> transport mode).
> Cookies can now be passed like:
> {code}
> jdbc:hive2://<host>:<port>/<db_name>;transportMode=http;httpPath=<http_endpoint>;http.cookie.<name1>=<value1>;http.cookie.<name2>=<value2>
> {code}





[jira] [Comment Edited] (HIVE-18728) Secure webHCat with SSL

2018-02-20 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16369866#comment-16369866
 ] 

Lefty Leverenz edited comment on HIVE-18728 at 2/20/18 9:50 AM:


I left a comment on RB pointing to the documentation in this jira's 
description.  (The config descriptions don't appear anywhere in the code – is 
that normal for WebHCat configs?)  The doc is fine, I'd just add "the" a couple 
of times in the example text.

When the patch is committed, this documentation should become new sections in 
the WebHCat configuration wiki and the configs should be listed in the 
Configuration Variables section:
 * [WebHCat Configure 
|https://cwiki.apache.org/confluence/display/Hive/WebHCat+Configure]
 * [WebHCat Configure – Configuration Variables 
|https://cwiki.apache.org/confluence/display/Hive/WebHCat+Configure#WebHCatConfigure-ConfigurationVariables]


was (Author: le...@hortonworks.com):
I left a comment on RB pointing to the documentation in this jira's 
description.  (The config descriptions don't appear anywhere in the code – is 
that normal for WebHCat configs?)  The doc is fine, I'd just add "the" a couple 
of times in the example text.

When the patch is committed, this documentation should become new sections in 
the WebHCat configuration wiki and the configs should be listed in the 
Configuration Variables section:
 * [WebHCat Configure 
|https://cwiki.apache.org/confluence/display/Hive/WebHCat+Configure]
 * [WebHCat Configure – ConfigurationVariables 
|https://cwiki.apache.org/confluence/display/Hive/WebHCat+Configure#WebHCatConfigure-ConfigurationVariables]

> Secure webHCat with SSL
> ---
>
> Key: HIVE-18728
> URL: https://issues.apache.org/jira/browse/HIVE-18728
> Project: Hive
>  Issue Type: New Feature
>  Components: Security
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18728.1.patch, HIVE-18728.2.patch
>
>
> Doc for the issue:
> *Configure WebHCat server to use SSL encryption*
> You can configure WebHCat REST-API to use SSL (Secure Sockets Layer) 
> encryption. The following WebHCat properties are added to enable SSL. 
> {{templeton.use.ssl}}
> Default value: {{false}}
> Description: Set this to true to use SSL encryption for the WebHCat server
> {{templeton.keystore.path}}
> Default value: {{}}
> Description: SSL certificate keystore location for WebHCat server
> {{templeton.keystore.password}}
> Default value: {{}}
> Description: SSL certificate keystore password for WebHCat server
> {{templeton.ssl.protocol.blacklist}}
> Default value: {{SSLv2,SSLv3}}
> Description: SSL Versions to disable for WebHCat server
> {{templeton.host}}
> Default value: {{0.0.0.0}}
> Description: The host address the WebHCat server will listen on.
> *Modifying the {{webhcat-site.xml}} file*
> Configure the following properties in the {{webhcat-site.xml}} file to enable 
> SSL encryption on each node where WebHCat is installed: 
> {code}
> 
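> <property>
>   <name>templeton.use.ssl</name>
>   <value>true</value>
> </property>
> <property>
>   <name>templeton.keystore.path</name>
>   <value>/path/to/ssl_keystore</value>
> </property>
> <property>
>   <name>templeton.keystore.password</name>
>   <value>password</value>
> </property>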
> {code}
> *Example:* To check the status of a WebHCat server configured for SSL 
> encryption, use the following command:
> {code}
> curl -k 'https://<user>:<password>@<host>:50111/templeton/v1/status'
> {code}
> Replace {{<user>}} and {{<password>}} with a valid user/password.  Replace 
> {{<host>}} with your host name.
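
The configuration steps above can be sketched as a short Python script that emits the three SSL properties and the status URL. This is only a sketch: webhcat_ssl_properties, to_site_xml, and status_url are hypothetical helpers, the host name is made up, and the configuration/property/name/value element layout is assumed to be the usual Hadoop-style site file format. The keystore path and password are the placeholder values from the description.

```python
# Sketch only: helper names are hypothetical; XML layout is an assumption.
import xml.etree.ElementTree as ET


def webhcat_ssl_properties(keystore_path, keystore_password):
    """Return the properties the description says enable SSL for WebHCat."""
    return {
        "templeton.use.ssl": "true",
        "templeton.keystore.path": keystore_path,
        "templeton.keystore.password": keystore_password,
    }


def to_site_xml(properties):
    """Serialize name/value pairs as a Hadoop-style site XML string."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


def status_url(host, port=50111):
    """URL of the status endpoint used in the curl example."""
    return "https://%s:%d/templeton/v1/status" % (host, port)


site_xml = to_site_xml(webhcat_ssl_properties("/path/to/ssl_keystore", "password"))
print(site_xml)
print(status_url("webhcat.example.com"))
```

In practice the password would not be hard-coded; the literal value here only mirrors the example in the description.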





[jira] [Commented] (HIVE-18728) Secure webHCat with SSL

2018-02-20 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16369866#comment-16369866
 ] 

Lefty Leverenz commented on HIVE-18728:
---

I left a comment on RB pointing to the documentation in this jira's 
description.  (The config descriptions don't appear anywhere in the code – is 
that normal for WebHCat configs?)  The doc is fine, I'd just add "the" a couple 
of times in the example text.

When the patch is committed, this documentation should become new sections in 
the WebHCat configuration wiki and the configs should be listed in the 
Configuration Variables section:
 * [WebHCat Configure 
|https://cwiki.apache.org/confluence/display/Hive/WebHCat+Configure]
 * [WebHCat Configure – ConfigurationVariables 
|https://cwiki.apache.org/confluence/display/Hive/WebHCat+Configure#WebHCatConfigure-ConfigurationVariables]

> Secure webHCat with SSL
> ---
>
> Key: HIVE-18728
> URL: https://issues.apache.org/jira/browse/HIVE-18728
> Project: Hive
>  Issue Type: New Feature
>  Components: Security
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18728.1.patch, HIVE-18728.2.patch
>
>
> Doc for the issue:
> *Configure WebHCat server to use SSL encryption*
> You can configure WebHCat REST-API to use SSL (Secure Sockets Layer) 
> encryption. The following WebHCat properties are added to enable SSL. 
> {{templeton.use.ssl}}
> Default value: {{false}}
> Description: Set this to true to use SSL encryption for the WebHCat server
> {{templeton.keystore.path}}
> Default value: {{}}
> Description: SSL certificate keystore location for WebHCat server
> {{templeton.keystore.password}}
> Default value: {{}}
> Description: SSL certificate keystore password for WebHCat server
> {{templeton.ssl.protocol.blacklist}}
> Default value: {{SSLv2,SSLv3}}
> Description: SSL Versions to disable for WebHCat server
> {{templeton.host}}
> Default value: {{0.0.0.0}}
> Description: The host address the WebHCat server will listen on.
> *Modifying the {{webhcat-site.xml}} file*
> Configure the following properties in the {{webhcat-site.xml}} file to enable 
> SSL encryption on each node where WebHCat is installed: 
> {code}
> 
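> <property>
>   <name>templeton.use.ssl</name>
>   <value>true</value>
> </property>
> <property>
>   <name>templeton.keystore.path</name>
>   <value>/path/to/ssl_keystore</value>
> </property>
> <property>
>   <name>templeton.keystore.password</name>
>   <value>password</value>
> </property>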
> {code}
> *Example:* To check the status of a WebHCat server configured for SSL 
> encryption, use the following command:
> {code}
> curl -k 'https://<user>:<password>@<host>:50111/templeton/v1/status'
> {code}
> Replace {{<user>}} and {{<password>}} with a valid user/password.  Replace 
> {{<host>}} with your host name.





[jira] [Commented] (HIVE-18341) Add repl load support for adding "raw" namespace for TDE with same encryption keys

2018-02-20 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16369835#comment-16369835
 ] 

Lefty Leverenz commented on HIVE-18341:
---

Doc note:  This adds the configuration parameter 
*hive.repl.add.raw.reserved.namespace* to HiveConf.java.  It is documented here 
(thanks, [~anishek]):

* [hive.repl.add.raw.reserved.namespace | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.repl.add.raw.reserved.namespace]

> Add repl load support for adding "raw" namespace for TDE with same encryption 
> keys
> --
>
> Key: HIVE-18341
> URL: https://issues.apache.org/jira/browse/HIVE-18341
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-18341.0.patch, HIVE-18341.1.patch, 
> HIVE-18341.2.patch, HIVE-18341.3.patch
>
>
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html#Running_as_the_superuser
> "a new virtual path prefix, /.reserved/raw/, that gives superusers direct 
> access to the underlying block data in the filesystem. This allows superusers 
> to distcp data without needing having access to encryption keys, and also 
> avoids the overhead of decrypting and re-encrypting data."
> We need to introduce a new option in the "Repl Load" command that will prefix 
> the paths of the files being copied by distcp with this "/.reserved/raw/" 
> namespace.
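
The path rewrite described above can be sketched in Python. This is an illustration only: to_raw_path is a hypothetical helper, not the code added to Hive, and the sample URI is made up. It assumes the prefix is inserted at the start of the path component, leaving any scheme and authority untouched.

```python
# Sketch only: to_raw_path is hypothetical, not Hive's implementation.
from urllib.parse import urlsplit, urlunsplit

RAW_PREFIX = "/.reserved/raw"


def to_raw_path(uri):
    """Insert the /.reserved/raw prefix in front of an absolute HDFS path."""
    parts = urlsplit(uri)
    path = parts.path
    if not path.startswith("/"):
        raise ValueError("expected an absolute path: %r" % uri)
    # Leave paths alone that already point into the raw namespace.
    if path == RAW_PREFIX or path.startswith(RAW_PREFIX + "/"):
        return uri
    return urlunsplit(parts._replace(path=RAW_PREFIX + path))


print(to_raw_path("hdfs://nn1:8020/warehouse/db/t/000000_0"))
print(to_raw_path("/warehouse/db/t/000000_0"))
```

Making the function idempotent (already-prefixed paths are returned unchanged) is a choice of this sketch, so that applying it twice does not stack prefixes.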





[jira] [Commented] (HIVE-8472) Add ALTER DATABASE SET LOCATION

2018-01-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340357#comment-16340357
 ] 

Lefty Leverenz commented on HIVE-8472:
--

The documentation is here (thanks [~mithun]):

* [DDL -- Alter Database | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterDatabase]

By the way, Fix Version/s should include 2.2.1.
 

 

> Add ALTER DATABASE SET LOCATION
> ---
>
> Key: HIVE-8472
> URL: https://issues.apache.org/jira/browse/HIVE-8472
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0, 3.0.0, 2.4.0
>Reporter: Jeremy Beard
>Assignee: Mithun Radhakrishnan
>Priority: Major
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-8472.1-branch-2.patch, HIVE-8472.1.patch, 
> HIVE-8472.2-branch-2.patch, HIVE-8472.3.patch
>
>
> Similarly to ALTER TABLE tablename SET LOCATION, it would be helpful if there 
> was an equivalent for databases.





[jira] [Commented] (HIVE-16997) Extend object store to store and use bit vectors

2018-01-13 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325466#comment-16325466
 ] 

Lefty Leverenz commented on HIVE-16997:
---

*hive.stats.fetch.bitvector* is documented now:

* [Configuration Properties -- hive.stats.fetch.bitvector | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.fetch.bitvector]

[~pxiong], please review the doc and make any needed corrections, then remove 
the TODOC3.0 label.

> Extend object store to store and use bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-16997.01.patch, HIVE-16997.02.patch, 
> HIVE-16997.03.patch, HIVE-16997.04.patch, HIVE-16997.05.patch, 
> HIVE-16997.06.patch
>
>
> This patch includes: (1) a new serde for FMSketch (2) change of schema for 
> derby and mysql (3) support for date type (4) refactoring the extrapolation 
> and merge code



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18296) Document hive.exec.local.scratchdir

2018-01-13 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-18296:
--
Labels:   (was: TODOC1.0 TODOC2.0)

> Document hive.exec.local.scratchdir
> ---
>
> Key: HIVE-18296
> URL: https://issues.apache.org/jira/browse/HIVE-18296
> Project: Hive
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Lefty Leverenz
>Priority: Minor
>
> Document configuration variable {{hive.exec.local.scratchdir}}





[jira] [Comment Edited] (HIVE-1577) Add configuration property hive.exec.local.scratchdir

2018-01-13 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14138568#comment-14138568
 ] 

Lefty Leverenz edited comment on HIVE-1577 at 1/14/18 6:25 AM:
---

Doc note:  *hive.exec.local.scratchdir* should also be documented in the 
Configuration Properties wikidoc, right after *hive.exec.scratchdir*:

* [Configuration Properties -- hive.exec.scratchdir | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.scratchdir]

Update 13/Jan/18:  the doc is done now in Configuration Properties, as 
requested by HIVE-18296.


was (Author: le...@hortonworks.com):
Doc note:  *hive.exec.local.scratchdir* should also be documented in the 
Configuration Properties wikidoc, right after *hive.exec.scratchdir*:

* [Configuration Properties -- hive.exec.scratchdir | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.scratchdir]

> Add configuration property hive.exec.local.scratchdir
> -
>
> Key: HIVE-1577
> URL: https://issues.apache.org/jira/browse/HIVE-1577
> Project: Hive
>  Issue Type: New Feature
>  Components: Configuration
>Reporter: Carl Steinbach
> Fix For: 0.10.0
>
>
> When Hive is run in local mode it uses the hardcoded local directory 
> {{/${java.io.tmpdir}/${user.name}}} for temporary files. This path should be
> configurable via the property {{hive.exec.local.scratchdir}}.
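
The requested behavior can be sketched in Python as a lookup with a fallback. This is only an illustration: local_scratch_dir is a hypothetical helper, not Hive code, and the plain dict stands in for a Hive configuration object. It assumes that an unset property falls back to the tmpdir/user.name path named in the description.

```python
# Sketch only: local_scratch_dir is hypothetical, not Hive's implementation.
import getpass
import posixpath
import tempfile


def local_scratch_dir(conf, tmpdir=None, user=None):
    """Pick the local scratch directory, preferring the configured property."""
    configured = conf.get("hive.exec.local.scratchdir")
    if configured:
        return configured
    # Fallback mirrors the hardcoded <java.io.tmpdir>/<user.name> location.
    tmpdir = tmpdir or tempfile.gettempdir()
    user = user or getpass.getuser()
    return posixpath.join(tmpdir, user)


print(local_scratch_dir({"hive.exec.local.scratchdir": "/data/hive-scratch"}))
print(local_scratch_dir({}, tmpdir="/tmp", user="alice"))
```

The tmpdir and user parameters exist only so the fallback can be exercised without depending on the machine the sketch runs on.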





[jira] [Commented] (HIVE-18296) Document hive.exec.local.scratchdir

2018-01-13 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325459#comment-16325459
 ] 

Lefty Leverenz commented on HIVE-18296:
---

Doc done, so I'm removing the TODOC labels.

* [Configuration Properties -- hive.exec.local.scratchdir | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.local.scratchdir]

By the way, *hive.exec.local.scratchdir* was previously documented in the Admin 
Manual:

* [AdminManual Configuration -- Hive Configuration Variables | 
https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration#AdminManualConfiguration-HiveConfigurationVariables]

[~belugabehr] if you're satisfied with the doc, please resolve this issue as 
fixed.

> Document hive.exec.local.scratchdir
> ---
>
> Key: HIVE-18296
> URL: https://issues.apache.org/jira/browse/HIVE-18296
> Project: Hive
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Lefty Leverenz
>Priority: Minor
>  Labels: TODOC1.0, TODOC2.0
>
> Document configuration variable {{hive.exec.local.scratchdir}}





[jira] [Assigned] (HIVE-18296) Document hive.exec.local.scratchdir

2018-01-13 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz reassigned HIVE-18296:
-

Assignee: Lefty Leverenz

> Document hive.exec.local.scratchdir
> ---
>
> Key: HIVE-18296
> URL: https://issues.apache.org/jira/browse/HIVE-18296
> Project: Hive
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Lefty Leverenz
>Priority: Minor
>  Labels: TODOC1.0, TODOC2.0
>
> Document configuration variable {{hive.exec.local.scratchdir}}





[jira] [Commented] (HIVE-18149) Stats: rownum estimation from datasize underestimates in most cases

2018-01-13 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325452#comment-16325452
 ] 

Lefty Leverenz commented on HIVE-18149:
---

Thanks Zoltan, I tweaked the doc to show the old default as well as the new.

> Stats: rownum estimation from datasize underestimates in most cases
> ---
>
> Key: HIVE-18149
> URL: https://issues.apache.org/jira/browse/HIVE-18149
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Fix For: 3.0.0
>
> Attachments: HIVE-18149.01.patch, HIVE-18149.01wip01.patch, 
> HIVE-18149.02.patch, HIVE-18149.03.patch, HIVE-18149.03wip01.patch, 
> HIVE-18149.03wip02.patch
>
>
> Row count estimation is currently based on the following:
> * The data size is taken from the following sources:
> ** basicstats aggregates the loaded "on-heap" row sizes; other readers are 
> able to give a "raw size" estimation. I've checked ORC, but I'm sure others 
> will do the same. The API docs are a bit vague about the methods' purpose...
> ** If the basicstats level info is not available, the filesystem level 
> "file-size-sums" are used as the "raw data size", which is multiplied by the 
> [deserialization 
> ratio|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L261]
> , which is currently 1.
> The problem with all of this is that the deser factor is 1, and that the row 
> size counts in the online object headers.
> Example: 20 rows are loaded into a partition in 
> [columnstats_partlvl_dp.q|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/test/queries/clientpositive/columnstats_partlvl_dp.q#L7]
> After HIVE-18108, [this 
> explain|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/test/queries/clientpositive/columnstats_partlvl_dp.q#L25]
>  will estimate the row size of the table to be 404 bytes; however, the 20 rows 
> of text are only 169 bytes, so it ends up with 0 rows.
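
The arithmetic in the description above can be sketched as follows. This is only an illustration of why the estimate truncates to 0, assuming the estimate is an integer division of (data size x deserialization factor) by the estimated row size. The class and method names are hypothetical and are not taken from StatsUtils.

```java
// Hypothetical illustration; not the actual Hive StatsUtils code.
final class RowCountEstimateSketch {

    /** Estimates rows as (dataSize * deserFactor) / avgRowSize, truncated toward zero. */
    static long estimateRows(long dataSizeBytes, double deserFactor, long avgRowSizeBytes) {
        if (avgRowSizeBytes <= 0) {
            throw new IllegalArgumentException("avgRowSizeBytes must be positive");
        }
        // Scale the raw size first, then apply integer division by the row size.
        return (long) (dataSizeBytes * deserFactor) / avgRowSizeBytes;
    }

    public static void main(String[] args) {
        // Figures from the description: 169 bytes of text, 404-byte row-size
        // estimate, deserialization factor 1.
        System.out.println(estimateRows(169, 1.0, 404)); // prints 0
    }
}
```

With a factor of 1, any file smaller than one estimated row yields 0 rows, which is the underestimation the issue describes.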





[jira] [Commented] (HIVE-18003) add explicit jdbc connection string args for mappings

2018-01-01 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307707#comment-16307707
 ] 

Lefty Leverenz commented on HIVE-18003:
---

Doc note:  This adds *hive.server2.wm.allow.any.pool.via.jdbc* to HiveConf.java 
and changes the name of *hive.server2.tez.wm.worker.threads* (created by 
HIVE-17841 for the same release) to *hive.server2.wm.worker.threads*.

They both belong in the HiveServer2 section of Configuration Properties, and 
*hive.server2.wm.worker.threads* should also be listed at the beginning of the 
Tez section.

* [Configuration Properties -- HiveServer2 | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveServer2]
* [Configuration Properties -- Tez | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Tez]

Added a TODOC3.0 label.  (Please add your own TODOC labels and doc notes from 
now on.  I no longer monitor Hive email for doc issues.)


> add explicit jdbc connection string args for mappings
> -
>
> Key: HIVE-18003
> URL: https://issues.apache.org/jira/browse/HIVE-18003
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18003.01.patch, HIVE-18003.02.patch, 
> HIVE-18003.03.patch, HIVE-18003.04.patch, HIVE-18003.05.patch, 
> HIVE-18003.patch
>
>
> 1) Force using unmanaged/containers execution.
> 2) Optional - specify pool name (config setting to gate this, disabled by 
> default?).
> In phase 2 (or 4?) we might allow #2 to be used by a user to choose between 
> multiple mappings if they have multiple pools they could be mapped to (i.e. 
> to change the ordering essentially). 





[jira] [Commented] (HIVE-17194) JDBC: Implement Gzip compression for HTTP mode

2018-01-01 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307703#comment-16307703
 ] 

Lefty Leverenz commented on HIVE-17194:
---

Doc note:  This adds *hive.server2.thrift.http.compression.enabled* to 
HiveConf.java, so it should be documented in the Configuration Properties 
wikidoc.  Gzip compression for HTTP mode could be documented in the Compression 
wikidoc.

* [Configuration Properties | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties]
* [Compression | 
https://cwiki.apache.org/confluence/display/Hive/CompressedStorage]

Thanks for adding the TODOC3.0 label, [~gopalv].

> JDBC: Implement Gzip compression for HTTP mode
> --
>
> Key: HIVE-17194
> URL: https://issues.apache.org/jira/browse/HIVE-17194
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, JDBC
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17194.1.patch, HIVE-17194.2.patch, 
> HIVE-17194.3.patch
>
>
> {code}
> POST /cliservice HTTP/1.1
> Content-Type: application/x-thrift
> Accept: application/x-thrift
> User-Agent: Java/THttpClient/HC
> Authorization: Basic YW5vbnltb3VzOmFub255bW91cw==
> Content-Length: 71
> Host: localhost:10007
> Connection: Keep-Alive
> Accept-Encoding: gzip,deflate
> X-XSRF-HEADER: true
> {code}
> The Beeline client clearly sends out HTTP compression headers, which are 
> ignored by the HTTP service layer in HS2.
> After the patch, the result looks like this:
> {code}
> HTTP/1.1 200 OK
> Date: Tue, 01 Aug 2017 01:47:23 GMT
> Content-Type: application/x-thrift
> Vary: Accept-Encoding, User-Agent
> Content-Encoding: gzip
> Transfer-Encoding: chunked
> Server: Jetty(9.3.8.v20160314)
> {code}
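
The header exchange above can be sketched in a few lines of Java. This is a minimal sketch using only the JDK, assuming a server that checks the Accept-Encoding request header and gzips the body when gzip is listed. It is not the HiveServer2 or Jetty implementation, the class and method names are hypothetical, and q-values such as "gzip;q=0" are ignored for brevity.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration; not the HiveServer2 HTTP service code.
final class GzipNegotiationSketch {

    /** Returns true if the Accept-Encoding value lists gzip (q-values are ignored). */
    static boolean clientAcceptsGzip(String acceptEncoding) {
        if (acceptEncoding == null) {
            return false;
        }
        for (String token : acceptEncoding.split(",")) {
            String coding = token.trim();
            int semi = coding.indexOf(';');
            if (semi >= 0) {
                coding = coding.substring(0, semi).trim(); // drop parameters such as ;q=0.5
            }
            if (coding.equalsIgnoreCase("gzip")) {
                return true;
            }
        }
        return false;
    }

    /** Compresses a response body, as a server would before sending Content-Encoding: gzip. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Decompresses a gzip body, as a client would after seeing Content-Encoding: gzip. */
    static byte[] gunzip(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

For the request shown above, `clientAcceptsGzip("gzip,deflate")` is true, so a server following this sketch would compress the body and set Content-Encoding: gzip.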





[jira] [Commented] (HIVE-16143) Improve msck repair batching

2018-01-01 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307698#comment-16307698
 ] 

Lefty Leverenz commented on HIVE-16143:
---

Doc note:  This adds *hive.msck.repair.batch.max.retries* and revises the 
description of *hive.msck.repair.batch.size* in release 2.4.0, so the wiki 
needs to be updated.

* [Configuration Properties -- hive.msck.repair.batch.size | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.msck.repair.batch.size]
* [Configuration Properties -- hive.msck.repair.batch.max.retries | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.msck.repair.batch.max.retries]
 (this link won't work until the parameter is documented)

Added a TODOC2.4 label.  (Please add your own TODOC labels and doc notes from 
now on.  I no longer monitor Hive email for doc issues.)

> Improve msck repair batching
> 
>
> Key: HIVE-16143
> URL: https://issues.apache.org/jira/browse/HIVE-16143
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>  Labels: TODOC2.4
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-16143.01.patch, HIVE-16143.02.patch, 
> HIVE-16143.03.patch, HIVE-16143.04.patch, HIVE-16143.05.patch, 
> HIVE-16143.06.patch, HIVE-16143.07.patch, HIVE-16143.08.patch, 
> HIVE-16143.09.patch, HIVE-16143.10-branch-2.patch, 
> HIVE-16143.10-branch-2.patch
>
>
> Currently, the {{msck repair table}} command batches the number of partitions 
> created in the metastore using the config {{HIVE_MSCK_REPAIR_BATCH_SIZE}}. 
> The following snippet shows the batching logic. There can be a couple of 
> improvements to this batching logic:
> {noformat} 
> try {
>   int batch_size = conf.getIntVar(ConfVars.HIVE_MSCK_REPAIR_BATCH_SIZE);
>   if (batch_size > 0 && partsNotInMs.size() > batch_size) {
>     int counter = 0;
>     for (CheckResult.PartitionResult part : partsNotInMs) {
>       counter++;
>       apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>       repairOutput.add("Repair: Added partition to metastore " + msckDesc.getTableName()
>           + ':' + part.getPartitionName());
>       // Flush a full batch, or the final partial batch.
>       if (counter % batch_size == 0 || counter == partsNotInMs.size()) {
>         db.createPartitions(apd);
>         apd = new AddPartitionDesc(table.getDbName(), table.getTableName(), false);
>       }
>     }
>   } else {
>     // Batching disabled or not needed: add everything in one call.
>     for (CheckResult.PartitionResult part : partsNotInMs) {
>       apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>       repairOutput.add("Repair: Added partition to metastore " + msckDesc.getTableName()
>           + ':' + part.getPartitionName());
>     }
>     db.createPartitions(apd);
>   }
> } catch (Exception e) {
>   LOG.info("Could not bulk-add partitions to metastore; trying one by one", e);
>   repairOutput.clear();
>   msckAddPartitionsOneByOne(db, table, partsNotInMs, repairOutput);
> }
> {noformat}
> 1. If the batch size is too aggressive, the code falls back to adding 
> partitions one by one, which is almost always very slow. Users can easily 
> increase the batch size to a higher value to make the command run 
> faster but end up with worse performance because the code falls back to adding 
> one by one. Users are then expected to determine the tuned value of batch 
> size which works well for their environment. I think the code could handle 
> this situation better by exponentially decaying the batch size instead of 
> falling back to one by one.
> 2. The other issue with this implementation is that if, say, the first batch 
> succeeds and the second one fails, the code tries to add all the partitions 
> one by one irrespective of whether some of them were successfully added or 
> not. If we need to fall back to one by one, we should at least remove the ones 
> which we know for sure are already added successfully.
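
The two improvements proposed above can be sketched together. This is a minimal, self-contained sketch and not the patch attached to this issue. It assumes each batch call is all-or-nothing (a failed batch adds nothing), and the names `addInBatches` and `addBatch` are hypothetical. On failure it halves the batch size and retries from the first item that has not yet been added, so already-added items are never resubmitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration; not the actual Hive msck repair code.
final class DecayingBatchSketch {

    /**
     * Adds items in batches. When a batch fails, the batch size is halved and
     * the same position is retried; items from earlier successful batches are
     * not submitted again. Returns the number of items added.
     */
    static <T> int addInBatches(List<T> items, int initialBatchSize, Consumer<List<T>> addBatch) {
        int batchSize = Math.max(1, initialBatchSize);
        int done = 0; // items[0, done) are known to be added
        while (done < items.size()) {
            int end = Math.min(done + batchSize, items.size());
            try {
                addBatch.accept(items.subList(done, end));
                done = end;
            } catch (RuntimeException e) {
                if (batchSize == 1) {
                    throw e; // a single item failed; decaying further cannot help
                }
                batchSize = batchSize / 2; // exponential decay
            }
        }
        return done;
    }

    public static void main(String[] args) {
        List<Integer> partitions = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            partitions.add(i);
        }
        List<Integer> stored = new ArrayList<>();
        // Hypothetical sink that rejects any batch larger than 3 items.
        int added = addInBatches(partitions, 8, batch -> {
            if (batch.size() > 3) {
                throw new IllegalStateException("batch too large: " + batch.size());
            }
            stored.addAll(batch);
        });
        System.out.println("added " + added + " partitions: " + stored);
    }
}
```

In this run the batch size decays from 8 to 4 to 2, after which every batch succeeds, so the worst case is a few wasted calls rather than a full one-by-one fallback.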





[jira] [Updated] (HIVE-16143) Improve msck repair batching

2018-01-01 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-16143:
--
Labels: TODOC2.4  (was: )

> Improve msck repair batching
> 
>
> Key: HIVE-16143
> URL: https://issues.apache.org/jira/browse/HIVE-16143
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>  Labels: TODOC2.4
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-16143.01.patch, HIVE-16143.02.patch, 
> HIVE-16143.03.patch, HIVE-16143.04.patch, HIVE-16143.05.patch, 
> HIVE-16143.06.patch, HIVE-16143.07.patch, HIVE-16143.08.patch, 
> HIVE-16143.09.patch, HIVE-16143.10-branch-2.patch, 
> HIVE-16143.10-branch-2.patch
>
>
> Currently, the {{msck repair table}} command batches the number of partitions 
> created in the metastore using the config {{HIVE_MSCK_REPAIR_BATCH_SIZE}}. 
> The following snippet shows the batching logic. There can be a couple of 
> improvements to this batching logic:
> {noformat} 
> try {
>   int batch_size = conf.getIntVar(ConfVars.HIVE_MSCK_REPAIR_BATCH_SIZE);
>   if (batch_size > 0 && partsNotInMs.size() > batch_size) {
>     int counter = 0;
>     for (CheckResult.PartitionResult part : partsNotInMs) {
>       counter++;
>       apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>       repairOutput.add("Repair: Added partition to metastore " + msckDesc.getTableName()
>           + ':' + part.getPartitionName());
>       // Flush a full batch, or the final partial batch.
>       if (counter % batch_size == 0 || counter == partsNotInMs.size()) {
>         db.createPartitions(apd);
>         apd = new AddPartitionDesc(table.getDbName(), table.getTableName(), false);
>       }
>     }
>   } else {
>     // Batching disabled or not needed: add everything in one call.
>     for (CheckResult.PartitionResult part : partsNotInMs) {
>       apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>       repairOutput.add("Repair: Added partition to metastore " + msckDesc.getTableName()
>           + ':' + part.getPartitionName());
>     }
>     db.createPartitions(apd);
>   }
> } catch (Exception e) {
>   LOG.info("Could not bulk-add partitions to metastore; trying one by one", e);
>   repairOutput.clear();
>   msckAddPartitionsOneByOne(db, table, partsNotInMs, repairOutput);
> }
> {noformat}
> 1. If the batch size is too aggressive, the code falls back to adding 
> partitions one by one, which is almost always very slow. Users can easily 
> increase the batch size to a higher value to make the command run 
> faster but end up with worse performance because the code falls back to adding 
> one by one. Users are then expected to determine the tuned value of batch 
> size which works well for their environment. I think the code could handle 
> this situation better by exponentially decaying the batch size instead of 
> falling back to one by one.
> 2. The other issue with this implementation is that if, say, the first batch 
> succeeds and the second one fails, the code tries to add all the partitions 
> one by one irrespective of whether some of them were successfully added or 
> not. If we need to fall back to one by one, we should at least remove the ones 
> which we know for sure are already added successfully.





[jira] [Commented] (HIVE-17690) Add distcp.options.p* in sql standard authorization config whitelist

2018-01-01 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307682#comment-16307682
 ] 

Lefty Leverenz commented on HIVE-17690:
---

No doc needed.  This changes a value of 
*hive.security.authorization.sqlstd.confwhitelist* that was added by HIVE-17571 
in release 3.0.0.

> Add distcp.options.p* in sql standard authorization config whitelist
> 
>
> Key: HIVE-17690
> URL: https://issues.apache.org/jira/browse/HIVE-17690
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 3.0.0
>
> Attachments: HIVE-17690.1.patch
>
>
> distcp arguments for "-p" are specified right after the "-p" without a space, 
> e.g. "-px".
> The whitelist needs to be modified to allow this.
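
A small Java sketch shows why a prefix-style entry is needed. It assumes the whitelist is a list of regular expressions matched against the whole configuration key, and that "-px" is carried as a key of the form distcp.options.px. Both patterns below are hypothetical illustrations, not the values in HiveConf.java.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; the real whitelist value lives in HiveConf.java.
final class DistcpWhitelistSketch {

    // Assumed exact-name entry: matches only the bare "p" option.
    static final Pattern EXACT = Pattern.compile("distcp\\.options\\.p");

    // Assumed prefix entry in the spirit of "distcp.options.p*": also matches
    // keys where the preserve flags follow "p" directly, such as "px".
    static final Pattern PREFIX = Pattern.compile("distcp\\.options\\.p.*");

    /** Returns true if the whole key matches the given whitelist pattern. */
    static boolean isAllowed(Pattern whitelistEntry, String key) {
        return whitelistEntry.matcher(key).matches();
    }

    public static void main(String[] args) {
        String key = "distcp.options.px";
        System.out.println("exact entry allows " + key + ": " + isAllowed(EXACT, key));
        System.out.println("prefix entry allows " + key + ": " + isAllowed(PREFIX, key));
    }
}
```

Because the match is against the whole key, the exact-name pattern rejects distcp.options.px while the prefix pattern accepts it.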





[jira] [Comment Edited] (HIVE-17652) retire ANALYZE TABLE ... PARTIALSCAN

2018-01-01 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307610#comment-16307610
 ] 

Lefty Leverenz edited comment on HIVE-17652 at 1/2/18 2:46 AM:
---

Doc note:  This removes the syntax ANALYZE TABLE ... COMPUTE STATISTICS 
PARTIALSCAN, which was introduced in release 0.11.0 by HIVE-3958.  The syntax 
still needs to be documented in the Statistics wikidoc, and the removal of 
reserved keyword PARTIALSCAN needs to be listed for release 3.0.0 in the DDL 
doc.  Also, the revised description of configuration property 
*hive.stats.gather.num.threads* needs to be documented with release information.

* [Statistics -- ExistingTables–ANALYZE | 
https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-ExistingTables–ANALYZE]
* [DDL -- Reserved Keywords | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ReservedKeywords]
* [Configuration Properties -- hive.stats.gather.num.threads | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.gather.num.threads]

Added a TODOC3.0 label.  (Please add your own TODOC labels and doc notes from 
now on.  I no longer monitor Hive email for doc issues.)


was (Author: le...@hortonworks.com):
Doc note:  This removes the syntax ANALYZE TABLE ... COMPUTE STATISTICS 
PARTIALSCAN, which was introduced in release 0.11.0 by HIVE-3958.  The syntax 
still needs to be documented in the Statistics wikidoc, and the removal of 
reserved keyword PARTIALSCAN needs to be listed for release 3.0.0 in the DDL 
doc.

* [Statistics -- ExistingTables–ANALYZE | 
https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-ExistingTables–ANALYZE]
* [DDL -- Reserved Keywords | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ReservedKeywords]

Added a TODOC3.0 label.  (Please add your own TODOC labels and doc notes from 
now on.  I no longer monitor Hive email for doc issues.)

> retire ANALYZE TABLE ... PARTIALSCAN
> 
>
> Key: HIVE-17652
> URL: https://issues.apache.org/jira/browse/HIVE-17652
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17652.01.patch
>
>
> * I think it's only usable for RCFiles 
> * a bit sophisticated
> * probably a reader which implements {{StatsProvidingRecordReader}} would be 
> more desirable
> * seems to be somewhat undocumented - and possibly unused





[jira] [Commented] (HIVE-17652) retire ANALYZE TABLE ... PARTIALSCAN

2018-01-01 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307610#comment-16307610
 ] 

Lefty Leverenz commented on HIVE-17652:
---

Doc note:  This removes the syntax ANALYZE TABLE ... COMPUTE STATISTICS 
PARTIALSCAN, which was introduced in release 0.11.0 by HIVE-3958.  The syntax 
still needs to be documented in the Statistics wikidoc, and the removal of 
reserved keyword PARTIALSCAN needs to be listed for release 3.0.0 in the DDL 
doc.

* [Statistics -- ExistingTables–ANALYZE | 
https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-ExistingTables–ANALYZE]
* [DDL -- Reserved Keywords | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ReservedKeywords]

Added a TODOC3.0 label.  (Please add your own TODOC labels and doc notes from 
now on.  I no longer monitor Hive email for doc issues.)

> retire ANALYZE TABLE ... PARTIALSCAN
> 
>
> Key: HIVE-17652
> URL: https://issues.apache.org/jira/browse/HIVE-17652
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17652.01.patch
>
>
> * I think it's only usable for RCFiles 
> * a bit sophisticated
> * probably a reader which implements {{StatsProvidingRecordReader}} would be 
> more desirable
> * seems to be somewhat undocumented - and possibly unused





[jira] [Updated] (HIVE-17652) retire ANALYZE TABLE ... PARTIALSCAN

2018-01-01 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-17652:
--
Labels: TODOC3.0  (was: )

> retire ANALYZE TABLE ... PARTIALSCAN
> 
>
> Key: HIVE-17652
> URL: https://issues.apache.org/jira/browse/HIVE-17652
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17652.01.patch
>
>
> * I think it's only usable for RCFiles 
> * a bit sophisticated
> * probably a reader which implements {{StatsProvidingRecordReader}} would be 
> more desirable
> * seems to be somewhat undocumented - and possibly unused





[jira] [Commented] (HIVE-3958) support partial scan for analyze command - RCFile

2018-01-01 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307607#comment-16307607
 ] 

Lefty Leverenz commented on HIVE-3958:
--

Doc note:  This adds the syntax ANALYZE TABLE ... COMPUTE STATISTICS 
PARTIALSCAN, which needs to be documented in the wiki.  The reserved keyword 
PARTIALSCAN is already documented.

* [Statistics -- ExistingTables–ANALYZE | 
https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-ExistingTables–ANALYZE]
* [DDL -- Reserved Keywords | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ReservedKeywords]

HIVE-17652 removes this syntax in release 3.0.0.

Added a TODOC11 label.

> support partial scan for analyze command - RCFile
> -
>
> Key: HIVE-3958
> URL: https://issues.apache.org/jira/browse/HIVE-3958
> Project: Hive
>  Issue Type: Improvement
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
>  Labels: TODOC11
> Fix For: 0.11.0
>
> Attachments: HIVE-3958.patch.1, HIVE-3958.patch.2, HIVE-3958.patch.3, 
> HIVE-3958.patch.4, HIVE-3958.patch.5, HIVE-3958.patch.6
>
>
> The analyze command allows us to collect statistics on existing 
> tables/partitions. It works great but might be slow since it scans all files.
> There are 2 ways to speed it up:
> 1. Collect stats without a file scan. It may not collect all stats but is good 
> and fast enough for the use case. HIVE-3917 addresses it.
> 2. Collect stats via a partial file scan. It doesn't scan all content of the 
> files but only part of it to get file metadata. Some examples are 
> https://cwiki.apache.org/Hive/rcfilecat.html for RCFile, ORC ( HIVE-3874 ) 
> and HFile of HBase (Edit:  That link should be 
> https://cwiki.apache.org/confluence/display/Hive/RCFileCat.)
> This jira is targeted to address #2, more specifically for the RCFile format.





[jira] [Updated] (HIVE-3958) support partial scan for analyze command - RCFile

2018-01-01 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-3958:
-
Labels: TODOC11  (was: )

> support partial scan for analyze command - RCFile
> -
>
> Key: HIVE-3958
> URL: https://issues.apache.org/jira/browse/HIVE-3958
> Project: Hive
>  Issue Type: Improvement
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
>  Labels: TODOC11
> Fix For: 0.11.0
>
> Attachments: HIVE-3958.patch.1, HIVE-3958.patch.2, HIVE-3958.patch.3, 
> HIVE-3958.patch.4, HIVE-3958.patch.5, HIVE-3958.patch.6
>
>
> The analyze command allows us to collect statistics on existing 
> tables/partitions. It works great but might be slow since it scans all files.
> There are 2 ways to speed it up:
> 1. Collect stats without a file scan. It may not collect all stats but is good 
> and fast enough for the use case. HIVE-3917 addresses it.
> 2. Collect stats via a partial file scan. It doesn't scan all content of the 
> files but only part of it to get file metadata. Some examples are 
> https://cwiki.apache.org/Hive/rcfilecat.html for RCFile, ORC ( HIVE-3874 ) 
> and HFile of HBase (Edit:  That link should be 
> https://cwiki.apache.org/confluence/display/Hive/RCFileCat.)
> This jira is targeted to address #2, more specifically for the RCFile format.





[jira] [Commented] (HIVE-17540) remove feature: describe pretty

2018-01-01 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307589#comment-16307589
 ] 

Lefty Leverenz commented on HIVE-17540:
---

Doc note:  This removes the non-reserved keyword PRETTY so the DDL doc needs to 
be updated for release 3.0.0.  It would also be nice to document the removed 
syntax (with release information) in the Describe section of the DDL doc.

* [DDL -- Non-reserved Keywords | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Non-reservedKeywords]
* [DDL -- Describe | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Describe]

Added a TODOC3.0 label.  (Please add your own TODOC labels and doc notes from 
now on.  I no longer monitor Hive email for doc issues.)

> remove feature: describe pretty
> ---
>
> Key: HIVE-17540
> URL: https://issues.apache.org/jira/browse/HIVE-17540
> Project: Hive
>  Issue Type: Wish
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17540.01.patch
>
>
> I've bumped into the "describe pretty" feature...which was probably a useful 
> thing when it was introduced...but I haven't seen it documented anywhere ... 
> google 
> [results|https://www.google.hu/search?num=50=%22describe+pretty%22+hive=%22describe+pretty%22+hive_l=psy-ab.3..35i39k1.4651.6121.0.6290.3.3.0.0.0.0.96.271.3.3.00...1.1.64.psy-ab..0.3.2700.NzSfMzfEfs8]
>  are only about the implementation of the feature.
> There are qtests for it:
> https://github.com/apache/hive/blob/88ca553c451d8d23778a912ec26b262eda402c68/ql/src/test/queries/clientpositive/describe_pretty.q#L35
> https://github.com/apache/hive/blob/88ca553c451d8d23778a912ec26b262eda402c68/ql/src/test/results/clientpositive/describe_pretty.q.out#L43
> this feature makes it possible to render column comments more nicely...
> I think beeline is already able to address formatting issues like this at a 
> much wider scale; so it might be ok to remove the 'describe pretty' 
> support...or at least I think it would be better to put efforts into 
> displaying things on beeline more nicely





[jira] [Updated] (HIVE-17540) remove feature: describe pretty

2018-01-01 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-17540:
--
Labels: TODOC3.0  (was: )

> remove feature: describe pretty
> ---
>
> Key: HIVE-17540
> URL: https://issues.apache.org/jira/browse/HIVE-17540
> Project: Hive
>  Issue Type: Wish
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17540.01.patch
>
>
> I've bumped into the "describe pretty" feature...which was probably a useful 
> thing when it was introduced...but I haven't seen it documented anywhere ... 
> google 
> [results|https://www.google.hu/search?num=50=%22describe+pretty%22+hive=%22describe+pretty%22+hive_l=psy-ab.3..35i39k1.4651.6121.0.6290.3.3.0.0.0.0.96.271.3.3.00...1.1.64.psy-ab..0.3.2700.NzSfMzfEfs8]
>  are only about the implementation of the feature.
> There are qtests for it:
> https://github.com/apache/hive/blob/88ca553c451d8d23778a912ec26b262eda402c68/ql/src/test/queries/clientpositive/describe_pretty.q#L35
> https://github.com/apache/hive/blob/88ca553c451d8d23778a912ec26b262eda402c68/ql/src/test/results/clientpositive/describe_pretty.q.out#L43
> this feature makes it possible to render column comments more nicely...
> I think beeline is already able to address formatting issues like this at a 
> much wider scale; so it might be ok to remove the 'describe pretty' 
> support...or at least I think it would be better to put efforts into 
> displaying things on beeline more nicely





[jira] [Updated] (HIVE-17841) implement applying the resource plan

2018-01-01 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-17841:
--
Labels:   (was: TODOC3.0)

> implement applying the resource plan
> 
>
> Key: HIVE-17841
> URL: https://issues.apache.org/jira/browse/HIVE-17841
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0
>
> Attachments: HIVE-17841.01.patch, HIVE-17841.02.patch, 
> HIVE-17841.03.patch, HIVE-17841.04.patch, HIVE-17841.05.patch, 
> HIVE-17841.06.patch, HIVE-17841.07.patch, HIVE-17841.patch
>
>






[jira] [Updated] (HIVE-18003) add explicit jdbc connection string args for mappings

2018-01-01 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-18003:
--
Labels: TODOC3.0  (was: )

> add explicit jdbc connection string args for mappings
> -
>
> Key: HIVE-18003
> URL: https://issues.apache.org/jira/browse/HIVE-18003
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18003.01.patch, HIVE-18003.02.patch, 
> HIVE-18003.03.patch, HIVE-18003.04.patch, HIVE-18003.05.patch, 
> HIVE-18003.patch
>
>
> 1) Force using unmanaged/containers execution.
> 2) Optional - specify pool name (config setting to gate this, disabled by 
> default?).
> In phase 2 (or 4?) we might allow #2 to be used by a user to choose between 
> multiple mappings if they have multiple pools they could be mapped to (i.e. 
> to change the ordering essentially). 





[jira] [Comment Edited] (HIVE-17841) implement applying the resource plan

2018-01-01 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16239405#comment-16239405
 ] 

Lefty Leverenz edited comment on HIVE-17841 at 1/2/18 12:56 AM:


Doc note:  This adds *hive.server2.tez.wm.worker.threads* to HiveConf.java, so 
it needs to be documented in the wiki.

It belongs in the HiveServer2 section of Configuration Properties, but should 
also be listed at the beginning of the Tez section.

* [Configuration Properties -- HiveServer2 | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveServer2]
* [Configuration Properties -- Tez | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Tez]

Added a TODOC3.0 label.

Update 01/01/18:  HIVE-18003 renames this parameter 
*hive.server2.wm.worker.threads* with the same default value and description, 
also for release 3.0.0.  (Removing the TODOC3.0 label.)


was (Author: le...@hortonworks.com):
Doc note:  This adds *hive.server2.tez.wm.worker.threads* to HiveConf.java, so 
it needs to be documented in the wiki.

It belongs in the HiveServer2 section of Configuration Properties, but should 
also be listed at the beginning of the Tez section.

* [Configuration Properties -- HiveServer2 | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveServer2]
* [Configuration Properties -- Tez | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Tez]

Added a TODOC3.0 label.

> implement applying the resource plan
> 
>
> Key: HIVE-17841
> URL: https://issues.apache.org/jira/browse/HIVE-17841
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17841.01.patch, HIVE-17841.02.patch, 
> HIVE-17841.03.patch, HIVE-17841.04.patch, HIVE-17841.05.patch, 
> HIVE-17841.06.patch, HIVE-17841.07.patch, HIVE-17841.patch
>
>






[jira] [Commented] (HIVE-13566) Auto-gather column stats - phase 1

2017-12-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307388#comment-16307388
 ] 

Lefty Leverenz commented on HIVE-13566:
---

Good docs, thanks Zoltan.

> Auto-gather column stats - phase 1
> --
>
> Key: HIVE-13566
> URL: https://issues.apache.org/jira/browse/HIVE-13566
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: HIVE-13566.01.patch, HIVE-13566.02.patch, 
> HIVE-13566.03.patch
>
>
> This jira adds code and tests for auto-gather column stats. Golden file 
> update will be done in phase 2 - HIVE-11160





[jira] [Commented] (HIVE-18230) create plan like plan, and replace plan commands for easy modification

2017-12-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307379#comment-16307379
 ] 

Lefty Leverenz commented on HIVE-18230:
---

Doc note:  This needs to be documented in the Explain wikidoc, and perhaps also 
with the new workload management feature.  ACTIVE needs to be listed in the DDL 
doc as a non-reserved keyword, along with WORKLOAD and MANAGEMENT (created by 
HIVE-18203).

* [Explain Plan | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain]
* [DDL -- Non-reserved Keywords | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Non-reservedKeywords]

Added a TODOC3.0 label. (Please add your own TODOC labels and doc notes in the 
future.)

> create plan like plan, and replace plan commands for easy modification
> --
>
> Key: HIVE-18230
> URL: https://issues.apache.org/jira/browse/HIVE-18230
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18230.01.patch, HIVE-18230.only.nogen.patch, 
> HIVE-18230.patch
>
>
> Given that the plan already on the cluster cannot be altered, it would be 
> helpful to have create plan like plan, and replace plan commands that would 
> make a copy to be modified, and then rename+apply the copy in place of an 
> existing plan, and rename the existing active plan with a versioned name or 
> drop it altogether.





[jira] [Updated] (HIVE-18230) create plan like plan, and replace plan commands for easy modification

2017-12-31 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-18230:
--
Labels: TODOC3.0  (was: )

> create plan like plan, and replace plan commands for easy modification
> --
>
> Key: HIVE-18230
> URL: https://issues.apache.org/jira/browse/HIVE-18230
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18230.01.patch, HIVE-18230.only.nogen.patch, 
> HIVE-18230.patch
>
>
> Given that the plan already on the cluster cannot be altered, it would be 
> helpful to have create plan like plan, and replace plan commands that would 
> make a copy to be modified, and then rename+apply the copy in place of an 
> existing plan, and rename the existing active plan with a versioned name or 
> drop it altogether.





[jira] [Commented] (HIVE-18203) change the way WM is enabled and allow dropping the last resource plan

2017-12-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307377#comment-16307377
 ] 

Lefty Leverenz commented on HIVE-18203:
---

Doc note:  This needs to be documented with the rest of the workload management 
feature, including the new keywords WORKLOAD and MANAGEMENT which should be 
added in the DDL doc.  They are reserved keywords in this patch but HIVE-18230 
makes them non-reserved.

* [DDL -- Keywords | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Keywords,Non-reservedKeywordsandReservedKeywords]

Added a TODOC3.0 label.  (Please add your own TODOC labels and doc notes in the 
future.)

> change the way WM is enabled and allow dropping the last resource plan
> --
>
> Key: HIVE-18203
> URL: https://issues.apache.org/jira/browse/HIVE-18203
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18203.01.patch, HIVE-18203.02.patch, 
> HIVE-18203.03.patch, HIVE-18203.04.patch, HIVE-18203.patch
>
>
> Currently it's impossible to drop the last active resource plan even if WM is 
> disabled. It should be possible to deactivate the last resource plan AND 
> disable WM in the same action. Activating a resource plan should enable WM in 
> this case.
> This should interact with the WM queue config in a sensible manner.





[jira] [Updated] (HIVE-18203) change the way WM is enabled and allow dropping the last resource plan

2017-12-31 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-18203:
--
Labels: TODOC3.0  (was: )

> change the way WM is enabled and allow dropping the last resource plan
> --
>
> Key: HIVE-18203
> URL: https://issues.apache.org/jira/browse/HIVE-18203
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18203.01.patch, HIVE-18203.02.patch, 
> HIVE-18203.03.patch, HIVE-18203.04.patch, HIVE-18203.patch
>
>
> Currently it's impossible to drop the last active resource plan even if WM is 
> disabled. It should be possible to deactivate the last resource plan AND 
> disable WM in the same action. Activating a resource plan should enable WM in 
> this case.
> This should interact with the WM queue config in a sensible manner.





[jira] [Commented] (HIVE-18224) Introduce interface above driver

2017-12-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307375#comment-16307375
 ] 

Lefty Leverenz commented on HIVE-18224:
---

Does this need to be documented in the wiki?

> Introduce interface above driver
> 
>
> Key: HIVE-18224
> URL: https://issues.apache.org/jira/browse/HIVE-18224
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Fix For: 3.0.0
>
> Attachments: HIVE-18224.01.patch, HIVE-18224.02.patch, 
> HIVE-18224.03.patch, HIVE-18224.04.patch
>
>
> Add an interface above driver; and use it outside of ql.
> The goal is to enable the overlaying of the Driver with some strategy.





[jira] [Commented] (HIVE-18095) add a unmanaged flag to triggers (applies to container based sessions)

2017-12-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307372#comment-16307372
 ] 

Lefty Leverenz commented on HIVE-18095:
---

Doc note:  This adds the non-reserved keyword UNMANAGED, which needs to be 
documented in the wiki.

* [Non-reserved Keywords | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Non-reservedKeywords]

Added a TODOC3.0 label.  (Please add your own TODOC labels and doc notes in the 
future.)

> add a unmanaged flag to triggers (applies to container based sessions)
> --
>
> Key: HIVE-18095
> URL: https://issues.apache.org/jira/browse/HIVE-18095
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18095.01.patch, HIVE-18095.02.patch, 
> HIVE-18095.nogen.patch, HIVE-18095.patch
>
>
> cc [~prasanth_j]
> It should be impossible to attach global triggers for pools. Setting global 
> flag should probably automatically remove attachments to pools.
> Global triggers would only support actions that Tez supports (for simplicity; 
> also, for now, move doesn't make a lot of sense because the trigger would 
> apply again after the move).





[jira] [Updated] (HIVE-18095) add a unmanaged flag to triggers (applies to container based sessions)

2017-12-31 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-18095:
--
Labels: TODOC3.0  (was: )

> add a unmanaged flag to triggers (applies to container based sessions)
> --
>
> Key: HIVE-18095
> URL: https://issues.apache.org/jira/browse/HIVE-18095
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18095.01.patch, HIVE-18095.02.patch, 
> HIVE-18095.nogen.patch, HIVE-18095.patch
>
>
> cc [~prasanth_j]
> It should be impossible to attach global triggers for pools. Setting global 
> flag should probably automatically remove attachments to pools.
> Global triggers would only support actions that Tez supports (for simplicity; 
> also, for now, move doesn't make a lot of sense because the trigger would 
> apply again after the move).





[jira] [Commented] (HIVE-18149) Stats: rownum estimation from datasize underestimates in most cases

2017-12-30 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307135#comment-16307135
 ] 

Lefty Leverenz commented on HIVE-18149:
---

Doc note:  This changes the default value of 
*hive.stats.deserialization.factor* from 1.0 to 10.0, so the wiki needs to be 
updated.

* [hive.stats.deserialization.factor | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.deserialization.factor]

Added a TODOC3.0 label.  (Please add your own TODOC labels and doc notes in the 
future.)

> Stats: rownum estimation from datasize underestimates in most cases
> ---
>
> Key: HIVE-18149
> URL: https://issues.apache.org/jira/browse/HIVE-18149
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18149.01.patch, HIVE-18149.01wip01.patch, 
> HIVE-18149.02.patch, HIVE-18149.03.patch, HIVE-18149.03wip01.patch, 
> HIVE-18149.03wip02.patch
>
>
> rownum estimation is based on the following fact as of now:
> * datasize being used from the following sources:
> ** basicstats aggregates the loaded "on-heap" row sizes ; other readers are 
> able to give "raw size" estimation - I've checked orc; but I'm sure others 
> will do the same; the API docs are a bit vague about the methods' purpose...
> ** if the basicstats level info is not available; the filesystem level 
> "file-size-sums" are used as the "raw data size" ; which is multiplied by the 
> [deserialization 
> ratio|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L261]
>  ; which is currently 1.
> the problem with all of this is that deser factor is 1; and that rowsize 
> counts in the online object headers..
> example; 20 rows are loaded into a partition 
> [columnstats_partlvl_dp.q|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/test/queries/clientpositive/columnstats_partlvl_dp.q#L7]
> after HIVE-18108 [this 
> explain|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/test/queries/clientpositive/columnstats_partlvl_dp.q#L25]
>  will estimate the rowsize of the table to be 404 bytes; however the 20 rows 
> of text is only 169 bytes...so it ends up with 0 rows...
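
A minimal sketch of the arithmetic in the example above, assuming the estimate is the file size multiplied by the deserialization factor and then integer-divided by the estimated row size. The helper name `estimate_row_count` is hypothetical and is not Hive's API; the numbers (169 bytes, 404 bytes per row, factors 1.0 and 10.0) come from this issue.

```python
def estimate_row_count(raw_size_bytes: int, deser_factor: float, avg_row_size_bytes: int) -> int:
    """Hypothetical helper, not Hive code: scale the on-disk size by the
    deserialization factor, then integer-divide by the estimated row size."""
    estimated_data_size = int(raw_size_bytes * deser_factor)
    return estimated_data_size // avg_row_size_bytes


# 20 rows of text occupy 169 bytes on disk; the estimated row size is 404 bytes.
with_old_default = estimate_row_count(169, 1.0, 404)
with_new_default = estimate_row_count(169, 10.0, 404)
print(with_old_default, with_new_default)  # prints "0 4"
```

With the old factor of 1.0 the estimate truncates to 0 rows; with 10.0 it becomes 4, which is still below the 20 rows actually loaded but is no longer zero.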





[jira] [Updated] (HIVE-18149) Stats: rownum estimation from datasize underestimates in most cases

2017-12-30 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-18149:
--
Labels: TODOC3.0  (was: )

> Stats: rownum estimation from datasize underestimates in most cases
> ---
>
> Key: HIVE-18149
> URL: https://issues.apache.org/jira/browse/HIVE-18149
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18149.01.patch, HIVE-18149.01wip01.patch, 
> HIVE-18149.02.patch, HIVE-18149.03.patch, HIVE-18149.03wip01.patch, 
> HIVE-18149.03wip02.patch
>
>
> rownum estimation is based on the following fact as of now:
> * datasize being used from the following sources:
> ** basicstats aggregates the loaded "on-heap" row sizes ; other readers are 
> able to give "raw size" estimation - I've checked orc; but I'm sure others 
> will do the same; the API docs are a bit vague about the methods' purpose...
> ** if the basicstats level info is not available; the filesystem level 
> "file-size-sums" are used as the "raw data size" ; which is multiplied by the 
> [deserialization 
> ratio|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L261]
>  ; which is currently 1.
> the problem with all of this is that deser factor is 1; and that rowsize 
> counts in the online object headers..
> example; 20 rows are loaded into a partition 
> [columnstats_partlvl_dp.q|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/test/queries/clientpositive/columnstats_partlvl_dp.q#L7]
> after HIVE-18108 [this 
> explain|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/test/queries/clientpositive/columnstats_partlvl_dp.q#L25]
>  will estimate the rowsize of the table to be 404 bytes; however the 20 rows 
> of text is only 169 bytes...so it ends up with 0 rows...





[jira] [Commented] (HIVE-18248) Clean up parameters

2017-12-30 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307130#comment-16307130
 ] 

Lefty Leverenz commented on HIVE-18248:
---

Doc note:  hadoop.bin.path and yarn.bin.path need to be added to the list of 
default values for *hive.conf.restricted.list* in the wiki.

* [Configuration Properties -- hive.conf.restricted.list | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.conf.restricted.list]

Added a TODOC3.0 label.  (Please add your own TODOC labels and doc notes in the 
future.)

> Clean up parameters
> ---
>
> Key: HIVE-18248
> URL: https://issues.apache.org/jira/browse/HIVE-18248
> Project: Hive
>  Issue Type: Bug
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18248.1.patch, HIVE-18248.2.patch, 
> HIVE-18248.3.patch
>
>
> Clean up of parameters that need not change at run time.





[jira] [Updated] (HIVE-18248) Clean up parameters

2017-12-30 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-18248:
--
Labels: TODOC3.0  (was: )

> Clean up parameters
> ---
>
> Key: HIVE-18248
> URL: https://issues.apache.org/jira/browse/HIVE-18248
> Project: Hive
>  Issue Type: Bug
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18248.1.patch, HIVE-18248.2.patch, 
> HIVE-18248.3.patch
>
>
> Clean up of parameters that need not change at run time.





[jira] [Commented] (HIVE-13567) Enable auto-gather column stats by default

2017-12-12 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16287328#comment-16287328
 ] 

Lefty Leverenz commented on HIVE-13567:
---

Doc note:  This changes the default value of *hive.stats.column.autogather* to 
true in release 3.0.0.  It was introduced in release 2.1.0 by HIVE-13566 and 
isn't documented in the wiki yet.

* [Configuration Properties -- Statistics | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Statistics]
** [Configuration Properties -- hive.stats.column.autogather | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.column.autogather]
 (this link won't work until the documentation is done)

Added a TODOC3.0 label.

(Please add your own TODOC labels and doc notes in the future.)

> Enable auto-gather column stats by default
> --
>
> Key: HIVE-13567
> URL: https://issues.apache.org/jira/browse/HIVE-13567
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Zoltan Haindrich
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-13567.01.patch, HIVE-13567.02.patch, 
> HIVE-13567.03.patch, HIVE-13567.04.patch, HIVE-13567.05.patch, 
> HIVE-13567.06.patch, HIVE-13567.07.patch, HIVE-13567.08.patch, 
> HIVE-13567.09.patch, HIVE-13567.10.patch, HIVE-13567.11.patch, 
> HIVE-13567.12.patch, HIVE-13567.13.patch, HIVE-13567.14.patch, 
> HIVE-13567.15.patch, HIVE-13567.16.patch, HIVE-13567.17.patch, 
> HIVE-13567.18.patch, HIVE-13567.19.patch, HIVE-13567.20.patch, 
> HIVE-13567.21.patch, HIVE-13567.22.patch, HIVE-13567.23.patch, 
> HIVE-13567.23wip01.patch, HIVE-13567.23wip02.patch, HIVE-13567.23wip03.patch, 
> HIVE-13567.23wip04.patch, HIVE-13567.23wip05.patch, HIVE-13567.23wip06.patch, 
> HIVE-13567.23wip07.patch, HIVE-13567.23wip08.patch, HIVE-13567.23wip09.patch, 
> HIVE-13567.23wip10.patch, HIVE-13567.24.patch
>
>
> In phase 2, we are going to turn auto-gather column stats on by default. This 
> requires updating golden files.





[jira] [Commented] (HIVE-13566) Auto-gather column stats - phase 1

2017-12-12 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16287321#comment-16287321
 ] 

Lefty Leverenz commented on HIVE-13566:
---

Doc note:  This adds *hive.stats.column.autogather* and changes the description 
of *hive.stats.autogather* in HiveConf.java for release 2.1.0, so the wiki 
needs to be updated.

* [Configuration Properties -- Statistics | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Statistics]
** [Configuration Properties -- hive.stats.autogather | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.autogather]
** [Configuration Properties -- hive.stats.column.autogather | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.column.autogather]
 (this link won't work until the documentation is done)

Added a TODOC2.1 label.

Update:  HIVE-13567 changes the default value of *hive.stats.column.autogather* 
to true in release 3.0.0.

> Auto-gather column stats - phase 1
> --
>
> Key: HIVE-13566
> URL: https://issues.apache.org/jira/browse/HIVE-13566
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>  Labels: TODOC2.1
> Fix For: 2.1.0
>
> Attachments: HIVE-13566.01.patch, HIVE-13566.02.patch, 
> HIVE-13566.03.patch
>
>
> This jira adds code and tests for auto-gather column stats. Golden file 
> update will be done in phase 2 - HIVE-11160





[jira] [Updated] (HIVE-13567) Enable auto-gather column stats by default

2017-12-12 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-13567:
--
Labels: TODOC3.0  (was: )

> Enable auto-gather column stats by default
> --
>
> Key: HIVE-13567
> URL: https://issues.apache.org/jira/browse/HIVE-13567
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Zoltan Haindrich
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-13567.01.patch, HIVE-13567.02.patch, 
> HIVE-13567.03.patch, HIVE-13567.04.patch, HIVE-13567.05.patch, 
> HIVE-13567.06.patch, HIVE-13567.07.patch, HIVE-13567.08.patch, 
> HIVE-13567.09.patch, HIVE-13567.10.patch, HIVE-13567.11.patch, 
> HIVE-13567.12.patch, HIVE-13567.13.patch, HIVE-13567.14.patch, 
> HIVE-13567.15.patch, HIVE-13567.16.patch, HIVE-13567.17.patch, 
> HIVE-13567.18.patch, HIVE-13567.19.patch, HIVE-13567.20.patch, 
> HIVE-13567.21.patch, HIVE-13567.22.patch, HIVE-13567.23.patch, 
> HIVE-13567.23wip01.patch, HIVE-13567.23wip02.patch, HIVE-13567.23wip03.patch, 
> HIVE-13567.23wip04.patch, HIVE-13567.23wip05.patch, HIVE-13567.23wip06.patch, 
> HIVE-13567.23wip07.patch, HIVE-13567.23wip08.patch, HIVE-13567.23wip09.patch, 
> HIVE-13567.23wip10.patch, HIVE-13567.24.patch
>
>
> In phase 2, we are going to turn auto-gather column stats on by default. This 
> requires updating golden files.





[jira] [Commented] (HIVE-18251) Loosen restriction for some checks

2017-12-11 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16287053#comment-16287053
 ] 

Lefty Leverenz commented on HIVE-18251:
---

Doc note:  This changes the default value of 
*hive.strict.checks.cartesian.product* to false.  It isn't documented in the 
wiki yet -- see HIVE-12727.

Added a TODOC3.0 label.

(Please add your own TODOC labels and doc notes in the future.)

> Loosen restriction for some checks
> --
>
> Key: HIVE-18251
> URL: https://issues.apache.org/jira/browse/HIVE-18251
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18251.patch
>
>






[jira] [Updated] (HIVE-18251) Loosen restriction for some checks

2017-12-11 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-18251:
--
Labels: TODOC3.0  (was: )

> Loosen restriction for some checks
> --
>
> Key: HIVE-18251
> URL: https://issues.apache.org/jira/browse/HIVE-18251
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18251.patch
>
>






[jira] [Commented] (HIVE-18196) Druid Mini Cluster to run Qtests integrations tests.

2017-12-11 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16287082#comment-16287082
 ] 

Lefty Leverenz commented on HIVE-18196:
---

Doc note:  This adds "derby" to the possible values for 
*hive.druid.metadata.db.type*, which was introduced in release 2.2.0 by 
HIVE-15277 and is not documented in the wiki yet.

Added a TODOC3.0 label.

(Please add your own TODOC labels and doc notes in the future.)

> Druid Mini Cluster to run Qtests integrations tests.
> 
>
> Key: HIVE-18196
> URL: https://issues.apache.org/jira/browse/HIVE-18196
> Project: Hive
>  Issue Type: Bug
>Reporter: slim bouguerra
>Assignee: Ashutosh Chauhan
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18196.10.patch, HIVE-18196.11.patch, 
> HIVE-18196.12.patch, HIVE-18196.2.patch, HIVE-18196.3.patch, 
> HIVE-18196.4.patch, HIVE-18196.5.patch, HIVE-18196.6.patch, 
> HIVE-18196.7.patch, HIVE-18196.8.patch, HIVE-18196.patch
>
>
> The overall goal of this is to add a new module that can fork a Druid cluster 
> to run integration testing as part of the Mini Clusters Qtest suite.





[jira] [Comment Edited] (HIVE-15277) Teach Hive how to create/delete Druid segments

2017-12-11 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15753038#comment-15753038
 ] 

Lefty Leverenz edited comment on HIVE-15277 at 12/12/17 4:14 AM:
-

The new table property should be documented here as well as in the Druid 
Integration doc:

* [DDL -- Table Properties | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-listTableProperties]

Also document the new configuration parameters:

*  *hive.druid.indexer.segments.granularity*
*  *hive.druid.indexer.partition.size.max*
*  *hive.druid.indexer.memory.rownum.max*
*  *hive.druid.basePersistDirectory*
*  *hive.druid.storage.storageDirectory*
*  *hive.druid.metadata.base*
*  *hive.druid.metadata.db.type*  (Edit:  see HIVE-15809 for correct values)
 (Edit 2:  see HIVE-18196 for new value in 3.0.0)
*  *hive.druid.metadata.username*
*  *hive.druid.metadata.password*
*  *hive.druid.metadata.uri*
*  *hive.druid.working.directory*

At this point there are enough Druid configuration parameters for a separate 
subsection in the Configuration Properties doc.  (Also see HIVE-14217 and 
HIVE-15273.)

* [Hive Configuration Properties | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveConfigurationProperties]

Added a TODOC2.2 label.


was (Author: le...@hortonworks.com):
The new table property should be documented here as well as in the Druid 
Integration doc:

* [DDL -- Table Properties | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-listTableProperties]

Also document the new configuration parameters:

*  *hive.druid.indexer.segments.granularity*
*  *hive.druid.indexer.partition.size.max*
*  *hive.druid.indexer.memory.rownum.max*
*  *hive.druid.basePersistDirectory*
*  *hive.druid.storage.storageDirectory*
*  *hive.druid.metadata.base*
*  *hive.druid.metadata.db.type*  (Edit:  see HIVE-15809 for correct values)
*  *hive.druid.metadata.username*
*  *hive.druid.metadata.password*
*  *hive.druid.metadata.uri*
*  *hive.druid.working.directory*

At this point there are enough Druid configuration parameters for a separate 
subsection in the Configuration Properties doc.  (Also see HIVE-14217 and 
HIVE-15273.)

* [Hive Configuration Properties | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveConfigurationProperties]

Added a TODOC2.2 label.

> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-15277.2.patch, HIVE-15277.patch, HIVE-15277.patch, 
> HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, 
> HIVE-15277.patch, file.patch
>
>
> We want to extend the DruidStorageHandler to support CTAS queries.
> In this implementation Hive will generate druid segment files and insert the 
> metadata to signal the handoff to druid.
> The syntax will be as follows:
> {code:sql}
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "datasourcename")
> AS  `metric2`>;
> {code}
> This statement stores the results of query  in a Druid 
> datasource named 'datasourcename'. One of the columns of the query needs to 
> be the time dimension, which is mandatory in Druid. In particular, we use the 
> same convention that is used for Druid: there needs to be a column 
> named '__time' in the result of the executed query, which will act as the 
> time dimension column in Druid. Currently, the time dimension column needs to 
> be a 'timestamp' type column.
> Metrics can be of type long, double, or float, while dimensions are strings. 
> Keep in mind that Druid has a clear separation between dimensions and 
> metrics; therefore, if a Hive column contains numbers and needs to be 
> presented as a dimension, use the cast operator to cast it to a string. 
> This initial implementation interacts with the Druid metadata storage to 
> add or remove the table in Druid. Users need to supply the metadata config as 
> --hiveconf hive.druid.metadata.password=XXX --hiveconf 
> hive.druid.metadata.username=druid --hiveconf 
> hive.druid.metadata.uri=jdbc:mysql://host/druid





[jira] [Updated] (HIVE-18196) Druid Mini Cluster to run Qtests integrations tests.

2017-12-11 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-18196:
--
Labels: TODOC3.0  (was: )

> Druid Mini Cluster to run Qtests integrations tests.
> 
>
> Key: HIVE-18196
> URL: https://issues.apache.org/jira/browse/HIVE-18196
> Project: Hive
>  Issue Type: Bug
>Reporter: slim bouguerra
>Assignee: Ashutosh Chauhan
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18196.10.patch, HIVE-18196.11.patch, 
> HIVE-18196.12.patch, HIVE-18196.2.patch, HIVE-18196.3.patch, 
> HIVE-18196.4.patch, HIVE-18196.5.patch, HIVE-18196.6.patch, 
> HIVE-18196.7.patch, HIVE-18196.8.patch, HIVE-18196.patch
>
>
> The overall goal of this is to add a new module that can fork a Druid cluster 
> to run integration testing as part of the Mini Clusters Qtest suite.





[jira] [Comment Edited] (HIVE-12727) refactor Hive strict checks to be more granular, allow order by no limit and no partition filter by default for now

2017-12-11 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124535#comment-15124535
 ] 

Lefty Leverenz edited comment on HIVE-12727 at 12/12/17 3:44 AM:
-

Doc note:  This deprecates *hive.mapred.mode* in 2.0.0, changing its default 
value back to nonstrict after HIVE-12413 changed it to strict in the same 
release, and adds three new configuration parameters to replace 
*hive.mapred.mode* (*hive.strict.checks.large.query* with default false, 
*hive.strict.checks.type.safety* with default true, and 
*hive.strict.checks.cartesian.product* with default true), so I added a 
TODOC2.0 label.

The parameter changes should be documented in the wiki here:

* [Configuration Properties -- hive.mapred.mode | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.mapred.mode]

Edit 11/Dec/17:  See doc updates in comment 22/Nov/16.


was (Author: le...@hortonworks.com):
Doc note:  This deprecates *hive.mapred.mode* in 2.0.0, changing its default 
value back to nonstrict after HIVE-12413 changed it to strict in the same 
release, and adds three new configuration parameters to replace 
*hive.mapred.mode* (*hive.strict.checks.large.query* with default false, 
*hive.strict.checks.type.safety* with default true, and 
*hive.strict.checks.cartesian.product* with default true), so I added a 
TODOC2.0 label.

The parameter changes should be documented in the wiki here:

* [Configuration Properties -- hive.mapred.mode | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.mapred.mode]

> refactor Hive strict checks to be more granular, allow order by no limit and 
> no partition filter by default for now
> ---
>
> Key: HIVE-12727
> URL: https://issues.apache.org/jira/browse/HIVE-12727
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12727.01.patch, HIVE-12727.02.patch, 
> HIVE-12727.03.patch, HIVE-12727.04.patch, HIVE-12727.05.patch, 
> HIVE-12727.06.patch, HIVE-12727.07.patch, HIVE-12727.patch
>
>
> Making strict mode the default recently appears to have broken many normal 
> queries, such as some TPCDS benchmark queries, e.g. Q85:
> Response message: org.apache.hive.service.cli.HiveSQLException: Error while 
> compiling statement: FAILED: SemanticException [Error 10041]: No partition 
> predicate found for Alias "web_sales" Table "web_returns"
> We should remove this restriction from strict mode, or change the default 
> back to non-strict. Perhaps make a 3-value parameter, nonstrict, semistrict, 
> and strict, for backward compat for people who are relying on strict already.





[jira] [Comment Edited] (HIVE-12727) refactor Hive strict checks to be more granular, allow order by no limit and no partition filter by default for now

2017-12-11 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15686092#comment-15686092
 ] 

Lefty Leverenz edited comment on HIVE-12727 at 12/12/17 3:41 AM:
-

HIVE-15148 changes the description of *hive.strict.checks.cartesian.product* in 
release 2.2.0.

Edit 11/Dec/17:  HIVE-18251 changes the default value of 
*hive.strict.checks.cartesian.product* to false in release 3.0.0.


was (Author: le...@hortonworks.com):
HIVE-15148 changes the description of *hive.strict.checks.cartesian.product* in 
release 2.2.0.

> refactor Hive strict checks to be more granular, allow order by no limit and 
> no partition filter by default for now
> ---
>
> Key: HIVE-12727
> URL: https://issues.apache.org/jira/browse/HIVE-12727
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12727.01.patch, HIVE-12727.02.patch, 
> HIVE-12727.03.patch, HIVE-12727.04.patch, HIVE-12727.05.patch, 
> HIVE-12727.06.patch, HIVE-12727.07.patch, HIVE-12727.patch
>
>
> Making strict mode the default recently appears to have broken many normal 
> queries, such as some TPCDS benchmark queries, e.g. Q85:
> Response message: org.apache.hive.service.cli.HiveSQLException: Error while 
> compiling statement: FAILED: SemanticException [Error 10041]: No partition 
> predicate found for Alias "web_sales" Table "web_returns"
> We should remove this restriction from strict mode, or change the default 
> back to non-strict. Perhaps make a 3-value parameter, nonstrict, semistrict, 
> and strict, for backward compat for people who are relying on strict already.
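
The parameter defaults mentioned in the comments on this issue can be written down as a small table. The following is only an illustrative sketch: the dictionary and function names are hypothetical, and the values are the ones quoted in the doc notes (release 2.0.0 defaults, plus the 3.0.0 change attributed to HIVE-18251), not values read from HiveConf.java.

```python
# Defaults as quoted in the doc notes for release 2.0.0 (assumed accurate).
STRICT_CHECK_DEFAULTS_2_0 = {
    "hive.strict.checks.large.query": False,
    "hive.strict.checks.type.safety": True,
    "hive.strict.checks.cartesian.product": True,
}

# Per the 11/Dec/17 edit, HIVE-18251 changes one default in release 3.0.0.
STRICT_CHECK_DEFAULTS_3_0 = {
    **STRICT_CHECK_DEFAULTS_2_0,
    "hive.strict.checks.cartesian.product": False,
}


def changed_defaults(old, new):
    """Return {parameter: (old_default, new_default)} for defaults that differ."""
    return {key: (old[key], new[key]) for key in old if old[key] != new[key]}
```

Calling `changed_defaults(STRICT_CHECK_DEFAULTS_2_0, STRICT_CHECK_DEFAULTS_3_0)` lists only the cartesian-product check, which is the one change the wiki would need to note for 3.0.0.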





[jira] [Comment Edited] (HIVE-15120) Storage based auth: allow option to enforce write checks for external tables

2017-12-07 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16156688#comment-16156688
 ] 

Lefty Leverenz edited comment on HIVE-15120 at 12/8/17 12:30 AM:
-

In the code, the flag, 
hive.metastore.authorization.storage.check.externaltable.drop, is true by 
default. 
But the comments say "The flag is set to false by default to maintain 
backward compatibility."
Either the comments/doc or the flag's default value should be modified.

Edit 07/Dec/17:  Just a typo fix (flay -> flag) but also a +1 for fixing the 
parameter description.


was (Author: yuan_zac):
In the code, the flag, 
hive.metastore.authorization.storage.check.externaltable.drop, is true by 
default. 
But In comments, it saids "The flag is set to false by default to maintain 
backward compatibility."
Comments /Doc or the flay default value, should be modified.  

> Storage based auth: allow option to enforce write checks for external tables
> 
>
> Key: HIVE-15120
> URL: https://issues.apache.org/jira/browse/HIVE-15120
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Daniel Dai
>  Labels: TODOC1.3, TODOC2.2
> Fix For: 1.3.0, 2.2.0
>
> Attachments: HIVE-15120.1.patch, HIVE-15120.2.patch, 
> HIVE-15120.3.patch, HIVE-15120.4.patch
>
>
> Under storage based authorization, we don't require write permissions on 
> table directory for external table create/drop.
> This is because external table contents are populated often from outside of 
> hive and are not written into from hive. So write access is not needed. Also, 
> we can't require write permissions to drop a table if we don't require them 
> for creation (users who created them should be able to drop them).
> However, this difference in behavior of external tables is not well 
> documented. So users get surprised to learn that drop table can be done by 
> just any user who has read access to the directory. At that point changing 
> the large number of scripts that use external tables is hard. 
> It would be good to have a user config option to have external tables 
> treated the same as managed tables.
> The option should be off by default, so that the behavior is backward 
> compatible by default.





[jira] [Comment Edited] (HIVE-12300) deprecate MR in Hive 2.0

2017-12-07 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15020813#comment-15020813
 ] 

Lefty Leverenz edited comment on HIVE-12300 at 12/8/17 12:17 AM:
-

Doc note:  This needs to be documented prominently in the wiki.  Also, the 
wiki's description of *hive.execution.engine* needs to be updated (without 
removing the old description for earlier versions).

* [Configuration Properties -- hive.execution.engine | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.execution.engine]

I suggest documenting MR deprecation on the wiki's home page and in the two 
requirements sections:

* [Home | https://cwiki.apache.org/confluence/display/Hive/Home]
* [Getting Started -- Requirements | 
https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-Requirements]
* [Installing Hive | 
https://cwiki.apache.org/confluence/display/Hive/AdminManual+Installation#AdminManualInstallation-InstallingHive]

There might be other appropriate places for it too.  Ideas, anyone?

Update 7/Dec/17:  *hive.execution.engine* has been revised.  Other wiki pages 
still need to be revised.


was (Author: le...@hortonworks.com):
Doc note:  This needs to be documented prominently in the wiki.  Also, the 
wiki's description of *hive.execution.engine* needs to be updated (without 
removing the old description for earlier versions).

* [Configuration Properties -- hive.execution.engine | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.execution.engine]

I suggest documenting MR deprecation on the wiki's home page and in the two 
requirements sections:

* [Home | https://cwiki.apache.org/confluence/display/Hive/Home]
* [Getting Started -- Requirements | 
https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-Requirements]
* [Installing Hive | 
https://cwiki.apache.org/confluence/display/Hive/AdminManual+Installation#AdminManualInstallation-InstallingHive]

There might be other appropriate places for it too.  Ideas, anyone?

> deprecate MR in Hive 2.0
> 
>
> Key: HIVE-12300
> URL: https://issues.apache.org/jira/browse/HIVE-12300
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Configuration, Documentation
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12300.01.patch, HIVE-12300.02.patch, 
> HIVE-12300.patch
>
>
> As suggested in the thread on dev alias





[jira] [Commented] (HIVE-17333) Schema changes in HIVE-12274 for Oracle may not work for upgrade

2017-12-07 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282667#comment-16282667
 ] 

Lefty Leverenz commented on HIVE-17333:
---

[~ngangam], branch-2 is for release 2.4.0 not 2.3.0.

Please change the fix version.

> Schema changes in HIVE-12274 for Oracle may not work for upgrade
> 
>
> Key: HIVE-17333
> URL: https://issues.apache.org/jira/browse/HIVE-17333
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Fix For: 2.3.0, 3.0.0
>
> Attachments: HIVE-17333.1.patch, HIVE-17333.patch
>
>
> According to 
> https://asktom.oracle.com/pls/asktom/f?p=100:11:0P11_QUESTION_ID:1770086700346491686
>  (reported in HIVE-12274)
> The alter table command to change the column datatype from {{VARCHAR}} to 
> {{CLOB}} may not work. So the correct way to accomplish this is to add a new 
> temp column, copy the value from the current column, drop the current column 
> and rename the new column to old column.
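
The four-step column swap described above can be sketched as a helper that builds the Oracle statements in order. This is a rough illustration, not the contents of the attached patch: the function name is hypothetical, and the table and column names are whatever the caller passes in.

```python
def varchar_to_clob_statements(table, column, temp_column):
    """Build the add / copy / drop / rename sequence described in the issue.

    All identifiers are supplied by the caller; none are taken from the
    actual HIVE-17333 patch or the metastore schema.
    """
    return [
        # 1. Add a temporary column with the target CLOB type.
        f"ALTER TABLE {table} ADD ({temp_column} CLOB)",
        # 2. Copy the existing values into the temporary column.
        f"UPDATE {table} SET {temp_column} = {column}",
        # 3. Drop the original VARCHAR column.
        f"ALTER TABLE {table} DROP COLUMN {column}",
        # 4. Give the temporary column the original name.
        f"ALTER TABLE {table} RENAME COLUMN {temp_column} TO {column}",
    ]
```

For example, `varchar_to_clob_statements("MY_TABLE", "MY_COL", "MY_COL_TMP")` returns the four statements for a made-up table; an upgrade script would run them in that order.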





[jira] [Commented] (HIVE-18088) Add WM event traces at query level for debugging

2017-12-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281456#comment-16281456
 ] 

Lefty Leverenz commented on HIVE-18088:
---

Doc note:  This adds *hive.tez.session.events.print.summary* to HiveConf.java, 
so it needs to be documented in the wiki.

* [Configuration Properties -- Tez | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Tez]

Does anything else need to be documented for this issue?

Added a TODOC3.0 label.

> Add WM event traces at query level for debugging
> 
>
> Key: HIVE-18088
> URL: https://issues.apache.org/jira/browse/HIVE-18088
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18088.1.patch, HIVE-18088.2.patch, 
> HIVE-18088.3.patch, HIVE-18088.4.patch, HIVE-18088.5.patch, 
> HIVE-18088.6.patch, HIVE-18088.7.patch, HIVE-18088.WIP.patch
>
>
> For debugging and testing purposes, expose workload manager events via the /jmx 
> endpoint and print a summary at the scope of the query.





[jira] [Updated] (HIVE-18088) Add WM event traces at query level for debugging

2017-12-06 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-18088:
--
Labels: TODOC3.0  (was: )






[jira] [Commented] (HIVE-18127) Do not strip '--' comments from shell commands issued from CliDriver

2017-12-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281452#comment-16281452
 ] 

Lefty Leverenz commented on HIVE-18127:
---

Okay, thanks Andrew.

> Do not strip '--' comments from shell commands issued from CliDriver
> 
>
> Key: HIVE-18127
> URL: https://issues.apache.org/jira/browse/HIVE-18127
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Fix For: 3.0.0
>
> Attachments: HIVE-18127.1.patch, HIVE-18127.2.patch
>
>
> CliDriver has the ability to run shell commands by prefixing them with '!'.
> This behavior is not widely used (there are only 3 examples in .q files).
> Since HIVE-16935 started stripping comments starting with '--', a shell 
> command containing '--' will not work correctly.
> Fix this by using the unstripped command for shell commands.
> Note that it would be a security hole for HS2 to allow execution of arbitrary 
> shell commands from a client command.
> Add tests to nail down correct behavior with '--' comments:
> * CliDriver should not strip strings starting with '--' in a shell command 
> (FIXED in this change).
> * HiveCli should strip '--' comments.
> * A Jdbc program should allow commands starting with "!" but these will fail 
> in the sql parser.
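
The intended CliDriver behavior described above can be sketched roughly as follows. This is a simplified illustration and not the code from the patch: the function name is hypothetical, and it assumes a '--' comment always runs to the end of the line, ignoring quoting that a real parser would have to handle.

```python
def prepare_command(line):
    """Return the text to execute for one CLI input line.

    Simplified assumption: '--' always starts a comment in SQL text,
    even inside quotes (the real parser must be more careful).
    """
    stripped = line.strip()
    if stripped.startswith("!"):
        # Shell command: keep it unstripped so arguments containing
        # '--' (for example long options) are passed through intact.
        return stripped
    # SQL text: drop everything from the first '--' onward.
    return stripped.split("--", 1)[0].rstrip()
```

With this sketch, `prepare_command("!ls --all")` keeps the `--all` option, while `prepare_command("select 1 -- note")` is reduced to `select 1`.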





[jira] [Commented] (HIVE-18127) Do not strip '--' comments from shell commands issued from CliDriver

2017-12-05 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16279798#comment-16279798
 ] 

Lefty Leverenz commented on HIVE-18127:
---

Does any of this need to be documented in the wiki?

> Do not strip '--' comments from shell commands issued from CliDriver
> 
>
> Key: HIVE-18127
> URL: https://issues.apache.org/jira/browse/HIVE-18127
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Fix For: 3.0.0
>
> Attachments: HIVE-18127.1.patch, HIVE-18127.2.patch
>
>
> CliDriver has the ability to run shell commands by prefixing them with '!'.
> This behavior is not widely used (there are only 3 examples in .q files).
> Since HIVE-16935 started stripping comments starting with '--', a shell 
> command containing '--' will not work correctly.
> Fix this by using the unstripped command for shell commands.
> Note that it would be a security hole for HS2 to allow execution of arbitrary 
> shell commands from a client command.
> Add tests to nail down correct behavior with '--' comments:
> * CliDriver should not strip strings starting with '--' in a shell command 
> (FIXED in this change).
> * HiveCli should strip '--' comments.
> * A Jdbc program should allow commands starting with "!" but these will fail 
> in the sql parser.





[jira] [Commented] (HIVE-18222) Update checkstyle rules to be less peeky

2017-12-05 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16279791#comment-16279791
 ] 

Lefty Leverenz commented on HIVE-18222:
---

If the line length gets changed, the doc needs to be updated here:

* [How To Contribute -- Coding Conventions | 
https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-CodingConventions]

> Update checkstyle rules to be less peeky
> 
>
> Key: HIVE-18222
> URL: https://issues.apache.org/jira/browse/HIVE-18222
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18222.01.patch
>
>
> There are a few issues with the current checkstyle.xml.
> As long as the new checks keep coming back red, people will start to ignore 
> them, so I think it would be better to make the checks less strict:
> * set max linelength to 140; it looks like a more natural limit - because 
> there are classnames which are eating up line space pretty quickly... like: 
> {{PrimitiveObjectInspector}} :) 
> * make checkstyle.xml easily importable into an IDE (use {{config_loc}} instead of 
> {{basedir}})
> * suppress generated vectorized class errors





[jira] [Commented] (HIVE-18029) beeline - support proper usernames based on the URL arg

2017-12-05 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16279781#comment-16279781
 ] 

Lefty Leverenz commented on HIVE-18029:
---

Okay, thanks Sergey.

> beeline - support proper usernames based on the URL arg
> ---
>
> Key: HIVE-18029
> URL: https://issues.apache.org/jira/browse/HIVE-18029
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0
>
> Attachments: HIVE-18029.patch
>
>
> Update:  looks like the argument connection URL is not handled consistently 
> with the connect command one, and the latter passes on user name that's 
> entered in the prompt correctly.
> So,
> {noformat}
> !connect (url) => prompt; the username on HS2 side is whatever is entered in 
> the prompt
> beeline -u (url) => anonymous (no prompt)
> !connect (url);user=foo => foo
> beeline -u (url);user=foo => anonymous
> beeline -n foo -u (url with or without the user) => foo
> {noformat}
> I'm going to add support for extracting the user from the -u argument, 
> similar to connect argument
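
The behavior the description aims for (user taken from the -u URL, with -n still taking precedence as in the last row of the table above) might be sketched like this. It is only an illustration of the intended outcome, not Beeline's actual code: the function name is hypothetical, and it assumes the user appears as a ';user=value' pair after the URL, before any '?' or '#' section.

```python
def effective_user(url, n_option=None):
    """Pick the user name for a `beeline -u <url> [-n <user>]` invocation.

    Assumptions: -n wins when given; otherwise a ';user=<name>' pair in
    the URL is used; otherwise the connection is anonymous.
    """
    if n_option:
        return n_option
    # Drop any '?...' or '#...' suffix so only ';key=value' pairs remain.
    for sep in "?#":
        url = url.split(sep, 1)[0]
    for part in url.split(";")[1:]:
        key, _, value = part.partition("=")
        if key == "user" and value:
            return value
    return "anonymous"
```

For a made-up URL such as `jdbc:hive2://example-host:10000/default;user=foo`, the sketch returns `foo`, which is the case that currently resolves to anonymous according to the table.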





[jira] [Commented] (HIVE-17954) Implement pool, user, group and trigger to pool management API's.

2017-12-05 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16278319#comment-16278319
 ] 

Lefty Leverenz commented on HIVE-17954:
---

Doc note:  This needs to be documented in the wiki, including the new keywords. 
 The DDL doc will need a section for POOL commands and USER/GROUP.

* [DDL | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL]
* [DDL -- Keywords | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Keywords,Non-reservedKeywordsandReservedKeywords]

Added a TODOC3.0 label.

> Implement pool, user, group and trigger to pool management API's.
> -
>
> Key: HIVE-17954
> URL: https://issues.apache.org/jira/browse/HIVE-17954
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17954.01.patch, HIVE-17954.02.patch, 
> HIVE-17954.03.patch, HIVE-17954.04.patch, HIVE-17954.05.patch, 
> HIVE-17954.06.patch, HIVE-17954.07.patch, HIVE-17954.08.patch, 
> HIVE-17954.09.patch, HIVE-17954.10.patch
>
>
> Implement the following commands:
> -- Pool management.
> CREATE POOL `resource_plan`.`pool_path` WITH
>   ALLOC_FRACTION=`fraction`,
>   QUERY_PARALLELISM=`parallelism`,
>   SCHEDULING_POLICY=`policy`;
> ALTER POOL `resource_plan`.`pool_path` SET
>   PATH = `new_path`,
>   ALLOC_FRACTION = `fraction`,
>   QUERY_PARALLELISM = `parallelism`,
>   SCHEDULING_POLICY = `policy`;
> DROP POOL `resource_plan`.`pool_path`;
> -- Adding triggers to pools.
> ALTER POOL `resource_plan`.`pool_path` ADD TRIGGER `trigger_name`;
> ALTER POOL `resource_plan`.`pool_path` DROP TRIGGER `trigger_name`;
> -- User/Group to pool mappings.
> CREATE USER|GROUP MAPPING `resource_plan`.`group_or_user_name`
>   TO `pool_path` WITH ORDERING `order_no`;
> DROP USER|GROUP MAPPING `resource_plan`.`group_or_user_name`;





[jira] [Updated] (HIVE-17954) Implement pool, user, group and trigger to pool management API's.

2017-12-05 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-17954:
--
Labels: TODOC3.0  (was: )






[jira] [Commented] (HIVE-18029) beeline - support proper usernames based on the URL arg

2017-12-05 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16278300#comment-16278300
 ] 

Lefty Leverenz commented on HIVE-18029:
---

Does this need to be documented in the wiki?

* [Beeline Command Options | 
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-BeelineCommandOptions]

> beeline - support proper usernames based on the URL arg
> ---
>
> Key: HIVE-18029
> URL: https://issues.apache.org/jira/browse/HIVE-18029
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0
>
> Attachments: HIVE-18029.patch
>
>
> Update:  looks like the argument connection URL is not handled consistently 
> with the connect command one, and the latter passes on user name that's 
> entered in the prompt correctly.
> So,
> {noformat}
> !connect (url) => prompt; the username on HS2 side is whatever is entered in 
> the prompt
> beeline -u (url) => anonymous (no prompt)
> !connect (url);user=foo => foo
> beeline -u (url);user=foo => anonymous
> beeline -n foo -u (url with or without the user) => foo
> {noformat}
> I'm going to add support for extracting the user from the -u argument, 
> similar to connect argument





[jira] [Commented] (HIVE-17600) Make OrcFile's "enforceBufferSize" user-settable.

2017-12-05 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16278213#comment-16278213
 ] 

Lefty Leverenz commented on HIVE-17600:
---

Doc note:  This adds *hive.exec.orc.buffer.size.enforce* (aka 
*orc.buffer.size.enforce*) to OrcConf.java.  I guess it should be documented in 
the ORC section of Configuration Properties, although that section needs an 
explanation about the move from HiveConf.java to OrcConf.java.

* [Configuration Properties -- ORC File Format | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-ORCFileFormat]

Added a TODOC2.2 label.

> Make OrcFile's "enforceBufferSize" user-settable.
> -
>
> Key: HIVE-17600
> URL: https://issues.apache.org/jira/browse/HIVE-17600
> Project: Hive
>  Issue Type: Improvement
>  Components: ORC
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>  Labels: TODOC2.2
> Fix For: 2.2.1
>
> Attachments: HIVE-17600.1-branch-2.2.patch
>
>
> This is a duplicate of ORC-238, but it applies to {{branch-2.2}}.
> Compression buffer-sizes in OrcFile are computed at runtime, except when 
> enforceBufferSize is set. The only snag here is that this flag can't be set 
> by the user.
> When runtime-computed buffer-sizes are not optimal (for some reason), the 
> user has no way to work around it by setting a custom value.
> I have a patch that we use at Yahoo.





[jira] [Commented] (HIVE-17361) Support LOAD DATA for transactional tables

2017-12-01 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16274191#comment-16274191
 ] 

Lefty Leverenz commented on HIVE-17361:
---

Doc note:  Support of LOAD DATA for transactional tables should be documented 
in the Transactions doc.

* [Hive Transactions -- Limitations | 
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Limitations]

Added a TODOC3.0 label.

(The changed description of *hive.txn.operational.properties* does not need 
documentation because the parameter is for internal use.)

> Support LOAD DATA for transactional tables
> --
>
> Key: HIVE-17361
> URL: https://issues.apache.org/jira/browse/HIVE-17361
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Eugene Koifman
>Priority: Critical
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17361.07.patch, HIVE-17361.08.patch, 
> HIVE-17361.09.patch, HIVE-17361.1.patch, HIVE-17361.10.patch, 
> HIVE-17361.11.patch, HIVE-17361.12.patch, HIVE-17361.14.patch, 
> HIVE-17361.16.patch, HIVE-17361.17.patch, HIVE-17361.19.patch, 
> HIVE-17361.2.patch, HIVE-17361.20.patch, HIVE-17361.21.patch, 
> HIVE-17361.23.patch, HIVE-17361.24.patch, HIVE-17361.25.patch, 
> HIVE-17361.3.patch, HIVE-17361.4.patch
>
>
> LOAD DATA has not been supported for ACID tables since ACID was introduced. 
> We need to fill this gap between ACID tables and regular Hive tables.
> Current Documentation is under [DML 
> Operations|https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-DMLOperations]
>  and [Loading files into 
> tables|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Loadingfilesintotables]:
> \\
> * Load Data performs very limited validation of the data. In particular, it 
> uses the input file name, which may not be of the form 0_0; this can break 
> some read logic.  (It certainly will for Acid.)
> * It does not check the schema of the file.  This may be a non-issue for Acid, 
> which requires ORC; ORC is self-describing, so Schema Evolution may handle 
> this seamlessly.  (Assuming the schema is not too different.)
> * It does check that _InputFormat_S are compatible. 
> * Bucketed (and thus sorted) tables don't support Load Data (but only if 
> hive.strict.checks.bucketing=true (default)).  Will keep this restriction for 
> Acid.
> * Load Data supports OVERWRITE clause
> * What happens to file permissions/ownership: rename vs copy differences
> \\
> The implementation will follow the same idea as in HIVE-14988 and use a 
> base_N/ dir for OVERWRITE clause.
> \\
> How is minor compaction going to handle delta/base with original files?
> Since delta_8_8/_meta_data is created before files are moved, delta_8_8 
> becomes visible before it's populated.  Is that an issue?
> It's not since txn 8 is not committed.
> h3. Implementation Notes/Limitations (patch 25)
> * bucketed/sorted tables are not supported
> * input files names must be of the form 0_0/0_0_copy_1 - enforced. 
> (HIVE-18125)
> * Load Data creates a delta_x_x/ that contains new files
> * Load Data w/Overwrite creates a base_x/ that contains new files
> * A '_metadata_acid' file is placed in the target directory to indicate it 
> requires special handling on read
> * The input files must be 'plain' ORC files, i.e. not contain acid metadata 
> columns as would be the case if these files were copied from another Acid 
> table.  In the latter case, the ROW_IDs embedded in the data may not make 
> sense in the target table (if it's in a different cluster, for example).  
> Such files may also have a mix of committed and aborted data.
> ** this could be relaxed later by adding info to the _metadata_acid file to 
> ignore existing ROW_IDs on read.
> * ROW_IDs are attached dynamically at read time and made permanent by 
> compaction.  This is done the same way as the handling of files that were 
> written to a table before it was converted to Acid.
> * Vectorization is supported
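
The directory naming in the implementation notes above (delta_x_x/ for a plain load, base_x/ with OVERWRITE) can be illustrated with a tiny helper. This is only a sketch under stated assumptions: the function and parameter names are hypothetical, and any zero-padding or other formatting Hive applies to the directory names is deliberately not modeled.

```python
def load_data_target_dir(write_id, overwrite=False):
    """Name the directory a LOAD DATA would create for one write.

    Follows the notes in the description: delta_x_x/ for a plain load,
    base_x/ when OVERWRITE is used. Padding of the id is not modeled.
    """
    if overwrite:
        return f"base_{write_id}/"
    return f"delta_{write_id}_{write_id}/"
```

With id 8 this gives `delta_8_8/`, matching the example directory mentioned in the description, and `base_8/` for the OVERWRITE case.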





[jira] [Updated] (HIVE-17361) Support LOAD DATA for transactional tables

2017-12-01 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-17361:
--
Labels: TODOC3.0  (was: )

> Support LOAD DATA for transactional tables
> --
>
> Key: HIVE-17361
> URL: https://issues.apache.org/jira/browse/HIVE-17361
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Eugene Koifman
>Priority: Critical
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17361.07.patch, HIVE-17361.08.patch, 
> HIVE-17361.09.patch, HIVE-17361.1.patch, HIVE-17361.10.patch, 
> HIVE-17361.11.patch, HIVE-17361.12.patch, HIVE-17361.14.patch, 
> HIVE-17361.16.patch, HIVE-17361.17.patch, HIVE-17361.19.patch, 
> HIVE-17361.2.patch, HIVE-17361.20.patch, HIVE-17361.21.patch, 
> HIVE-17361.23.patch, HIVE-17361.24.patch, HIVE-17361.25.patch, 
> HIVE-17361.3.patch, HIVE-17361.4.patch
>
>
> LOAD DATA was not supported since ACID was introduced. Need to fill this gap 
> between ACID table and regular hive table.
> Current Documentation is under [DML 
> Operations|https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-DMLOperations]
>  and [Loading files into 
> tables|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Loadingfilesintotables]:
> \\
> * Load Data performs very limited validations of the data, in particular it 
> uses the input file name which may not be in 0_0 which can break some 
> read logic.  (Certainly will for Acid).
> * It does not check the schema of the file.  This may be a non issue for Acid 
> which requires ORC which is self describing so Schema Evolution may handle 
> this seamlessly.  (Assuming Schema is not too different).
> * It does check that _InputFormat_S are compatible. 
> * Bucketed (and thus sorted) tables don't support Load Data (but only if 
> hive.strict.checks.bucketing=true (default)).  Will keep this restriction for 
> Acid.
> * Load Data supports OVERWRITE clause
> * What happens to file permissions/ownership: rename vs copy differences
> \\
> The implementation will follow the same idea as in HIVE-14988 and use a 
> base_N/ dir for OVERWRITE clause.
> \\
> How is minor compaction going to handle delta/base with original files?
> Since delta_8_8/_meta_data is created before files are moved, delta_8_8 
> becomes visible before it's populated.  Is that an issue?
> It's not since txn 8 is not committed.
> h3. Implementation Notes/Limitations (patch 25)
> * bucketed/sorted tables are not supported
> * input files names must be of the form 0_0/0_0_copy_1 - enforced. 
> (HIVE-18125)
> * Load Data creates a delta_x_x/ that contains new files
> * Load Data w/Overwrite creates a base_x/ that contains new files
> * A '_metadata_acid' file is placed in the target directory to indicate it 
> requires special handling on read
> * The input files must be 'plain' ORC files, i.e. not contain acid metadata 
> columns as would be the case if these files were copied from another Acid 
> table.  In the latter case, the ROW_IDs embedded in the data may not make 
> sense in the target table (if it's in a different cluster, for example).  
> Such files may also have a mix of committed and aborted data.
> ** this could be relaxed later by adding info to the _metadata_acid file to 
> ignore existing ROW_IDs on read.
> * ROW_IDs are attached dynamically at read time and made permanent by 
> compaction.  This is done the same way as the handling of files that were 
> written to a table before it was converted to Acid.
> * Vectorization is supported
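
A minimal sketch of the directory naming described in the notes above, assuming the unpadded delta_x_x/ and base_x/ notation used in this description (real directory names may be zero-padded). The class and method names are hypothetical and are not taken from the patch.

```java
// Hypothetical helper, not Hive's actual code: picks the name of the directory
// that Load Data writes new files into, following the notes above.
final class LoadDataTargetDir {
    private LoadDataTargetDir() {}

    /**
     * @param id        the "x" in delta_x_x/ or base_x/ (assumed non-negative)
     * @param overwrite true when the statement carries the OVERWRITE clause
     */
    static String name(long id, boolean overwrite) {
        if (id < 0) {
            throw new IllegalArgumentException("id must be >= 0: " + id);
        }
        // OVERWRITE replaces the table contents, so the files land in a base
        // directory; a plain load only adds files, so they land in a delta.
        return overwrite ? "base_" + id : "delta_" + id + "_" + id;
    }
}
```

With id 8, a plain load maps to delta_8_8 and a load with OVERWRITE maps to base_8, which matches the delta_8_8 example discussed above.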



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-14792) AvroSerde reads the remote schema-file at least once per mapper, per table reference.

2017-11-30 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16273845#comment-16273845
 ] 

Lefty Leverenz commented on HIVE-14792:
---

Doc note:  This adds *hive.optimize.update.table.properties.from.serde* and 
*hive.optimize.update.table.properties.from.serde.list* to HiveConf.java, so 
they need to be documented in the wiki.

* [Avro SerDe | https://cwiki.apache.org/confluence/display/Hive/AvroSerDe]
* [Configuration Properties -- SerDes | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-SerDes]

Added TODOC2.2 and TODOC2.4 labels.

> AvroSerde reads the remote schema-file at least once per mapper, per table 
> reference.
> -
>
> Key: HIVE-14792
> URL: https://issues.apache.org/jira/browse/HIVE-14792
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>  Labels: TODOC2.2, TODOC2.4
> Fix For: 3.0.0, 2.4.0, 2.2.1
>
> Attachments: HIVE-14792.1.patch
>
>
> Avro tables that use "external" schema files stored on HDFS can cause 
> excessive calls to {{FileSystem::open()}}, especially for queries that spawn 
> large numbers of mappers.
> This is because of the following code in {{AvroSerDe::initialize()}}:
> {code:title=AvroSerDe.java|borderStyle=solid}
> public void initialize(Configuration configuration, Properties properties)
>     throws SerDeException {
>   // ...
>   if (hasExternalSchema(properties)
>       || columnNameProperty == null || columnNameProperty.isEmpty()
>       || columnTypeProperty == null || columnTypeProperty.isEmpty()) {
>     schema = determineSchemaOrReturnErrorSchema(configuration, properties);
>   } else {
>     // Get column names and sort order
>     columnNames = Arrays.asList(columnNameProperty.split(","));
>     columnTypes = TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProperty);
>     schema = getSchemaFromCols(properties, columnNames, columnTypes,
>         columnCommentProperty);
>     properties.setProperty(
>         AvroSerdeUtils.AvroTableProperties.SCHEMA_LITERAL.getPropName(),
>         schema.toString());
>   }
>   // ...
> }
> {code}
> For tables using {{avro.schema.url}}, every time the SerDe is initialized 
> (i.e. at least once per mapper), the schema file is read remotely. For 
> queries with thousands of mappers, this leads to a stampede to the handful 
> (3?) of datanodes that host the schema-file. In the best case, this causes 
> slowdowns.
> It would be preferable to distribute the Avro-schema to all mappers as part 
> of the job-conf. The alternatives aren't exactly appealing:
> # One can't rely solely on the {{column.list.types}} stored in the Hive 
> metastore (HIVE-14789).
> # {{avro.schema.literal}} might not always be usable, because of the 
> size-limit on table-parameters. The typical size of the Avro-schema file is 
> between 0.5 and 3 MB, in my limited experience. Bumping the max 
> table-parameter size isn't a great solution.
> If the {{avro.schema.file}} were read during query-planning, and made 
> available as part of table-properties (but not serialized into the 
> metastore), the downstream logic would remain largely intact. I have a patch 
> that does this.
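
The idea in the description can be sketched as follows: read the schema once at planning time, attach it to an in-memory copy of the table properties, and let each mapper use that copy instead of opening the remote file. This is only an illustration under stated assumptions, not the attached patch. The class name, the method names, and the `remoteReader` function (a stand-in for the remote file read) are hypothetical; only the {{avro.schema.url}} and {{avro.schema.literal}} property names come from the description.

```java
import java.util.Properties;
import java.util.function.Function;

// Hypothetical illustration of "read once during planning, ship with the job".
final class PlanTimeSchema {
    static final String SCHEMA_URL = "avro.schema.url";
    static final String SCHEMA_LITERAL = "avro.schema.literal";

    private PlanTimeSchema() {}

    /** Planning side: returns a copy of the table properties with the schema inlined. */
    static Properties resolveAtPlanning(Properties tableProps,
                                        Function<String, String> remoteReader) {
        Properties jobProps = new Properties();
        jobProps.putAll(tableProps); // copy, so the stored table properties stay untouched
        String url = tableProps.getProperty(SCHEMA_URL);
        if (url != null && tableProps.getProperty(SCHEMA_LITERAL) == null) {
            // The single remote read for the whole query.
            jobProps.setProperty(SCHEMA_LITERAL, remoteReader.apply(url));
        }
        return jobProps;
    }

    /** Mapper side: prefers the inlined schema and only falls back to a remote read. */
    static String schemaForMapper(Properties jobProps,
                                  Function<String, String> remoteReader) {
        String literal = jobProps.getProperty(SCHEMA_LITERAL);
        if (literal != null) {
            return literal;
        }
        String url = jobProps.getProperty(SCHEMA_URL);
        if (url == null) {
            throw new IllegalStateException("no schema literal or URL in job properties");
        }
        return remoteReader.apply(url);
    }
}
```

Because `resolveAtPlanning` works on a copy, the inlined schema travels with the job but is never written back to the table properties, which mirrors the "not serialized into the metastore" condition above.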





[jira] [Updated] (HIVE-14792) AvroSerde reads the remote schema-file at least once per mapper, per table reference.

2017-11-30 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14792:
--
Labels: TODOC2.2 TODOC2.4  (was: TODOC3.0)

> AvroSerde reads the remote schema-file at least once per mapper, per table 
> reference.
> -
>
> Key: HIVE-14792
> URL: https://issues.apache.org/jira/browse/HIVE-14792
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>  Labels: TODOC2.2, TODOC2.4
> Fix For: 3.0.0, 2.4.0, 2.2.1
>
> Attachments: HIVE-14792.1.patch
>
>





[jira] [Updated] (HIVE-14792) AvroSerde reads the remote schema-file at least once per mapper, per table reference.

2017-11-30 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14792:
--
Labels: TODOC3.0  (was: )

> AvroSerde reads the remote schema-file at least once per mapper, per table 
> reference.
> -
>
> Key: HIVE-14792
> URL: https://issues.apache.org/jira/browse/HIVE-14792
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>  Labels: TODOC3.0
> Fix For: 3.0.0, 2.4.0, 2.2.1
>
> Attachments: HIVE-14792.1.patch
>
>





[jira] [Commented] (HIVE-14487) Add REBUILD statement for materialized views

2017-11-26 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16266274#comment-16266274
 ] 

Lefty Leverenz commented on HIVE-14487:
---

Doc note:  This needs to be documented in the wiki.

* [DDL | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL]

Added a TODOC3.0 label.

> Add REBUILD statement for materialized views
> 
>
> Key: HIVE-14487
> URL: https://issues.apache.org/jira/browse/HIVE-14487
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-14487.01.patch, HIVE-14487.02.patch, 
> HIVE-14487.03.patch, HIVE-14487.04.patch, HIVE-14487.patch
>
>
> Support for rebuilding existing materialized views. The statement is the 
> following:
> {code:sql}
> ALTER MATERIALIZED VIEW [db_name.]materialized_view_name REBUILD;
> {code}





[jira] [Updated] (HIVE-14487) Add REBUILD statement for materialized views

2017-11-26 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14487:
--
Labels: TODOC3.0  (was: )

> Add REBUILD statement for materialized views
> 
>
> Key: HIVE-14487
> URL: https://issues.apache.org/jira/browse/HIVE-14487
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-14487.01.patch, HIVE-14487.02.patch, 
> HIVE-14487.03.patch, HIVE-14487.04.patch, HIVE-14487.patch
>
>





[jira] [Comment Edited] (HIVE-17902) add notions of default pool and start adding unmanaged mapping

2017-11-19 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248783#comment-16248783
 ] 

Lefty Leverenz edited comment on HIVE-17902 at 11/20/17 4:28 AM:
-

Doc note:  This adds *hive.metastore.wm.default.pool.size* to HiveConf.java, so 
it needs to be documented in the wiki.  (Perhaps the LLAP section of 
Configuration Properties will have a subsection for workload management.)

* [Configuration Properties -- LLAP | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-LLAP]

Added a TODOC3.0 label.

Update 19/Nov/17:  Also document the non-reserved keywords DEFAULT and POOL for 
3.0.0 in the DDL doc.

* [DDL -- Keywords | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Keywords,Non-reservedKeywordsandReservedKeywords]


was (Author: le...@hortonworks.com):
Doc note:  This adds *hive.metastore.wm.default.pool.size* to HiveConf.java, so 
it needs to be documented in the wiki.  (Perhaps the LLAP section of 
Configuration Properties will have a subsection for workload management.)

* [Configuration Properties -- LLAP | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-LLAP]

Added a TODOC3.0 label.

> add notions of default pool and start adding unmanaged mapping
> --
>
> Key: HIVE-17902
> URL: https://issues.apache.org/jira/browse/HIVE-17902
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17902.01.patch, HIVE-17902.02.patch, 
> HIVE-17902.03.patch, HIVE-17902.04.patch, HIVE-17902.05.patch, 
> HIVE-17902.06.patch, HIVE-17902.07.patch, HIVE-17902.08.patch, 
> HIVE-17902.09.patch, HIVE-17902.10.patch, HIVE-17902.patch
>
>
> This is needed to map queries between WM and non-WM execution





[jira] [Commented] (HIVE-17932) Remove option to control partition level basic stats fetching

2017-11-19 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258798#comment-16258798
 ] 

Lefty Leverenz commented on HIVE-17932:
---

Thanks Zoltan, I've removed the TODOC3.0 label.

> Remove option to control partition level basic stats fetching
> -
>
> Key: HIVE-17932
> URL: https://issues.apache.org/jira/browse/HIVE-17932
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Fix For: 3.0.0
>
> Attachments: HIVE-17932.01.patch
>
>
> Disabling the fetching of partition stats 
> ({{hive.stats.fetch.partition.stats}}) may cause problems for partitioned 
> tables. The user might just want to disable the CBO instead of tweaking the 
> fetching of partition stats.





[jira] [Commented] (HIVE-17965) Remove HIVELIMITTABLESCANPARTITION support

2017-11-19 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258797#comment-16258797
 ] 

Lefty Leverenz commented on HIVE-17965:
---

Thanks Zoltan, I've removed the TODOC3.0 label.

> Remove HIVELIMITTABLESCANPARTITION support
> --
>
> Key: HIVE-17965
> URL: https://issues.apache.org/jira/browse/HIVE-17965
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: HIVE-17965.01.patch
>
>
> HIVE-13884 marked it as deprecated





[jira] [Updated] (HIVE-17965) Remove HIVELIMITTABLESCANPARTITION support

2017-11-19 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-17965:
--
Labels:   (was: TODOC3.0)

> Remove HIVELIMITTABLESCANPARTITION support
> --
>
> Key: HIVE-17965
> URL: https://issues.apache.org/jira/browse/HIVE-17965
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: HIVE-17965.01.patch
>
>





[jira] [Updated] (HIVE-17932) Remove option to control partition level basic stats fetching

2017-11-19 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-17932:
--
Labels:   (was: TODOC3.0)

> Remove option to control partition level basic stats fetching
> -
>
> Key: HIVE-17932
> URL: https://issues.apache.org/jira/browse/HIVE-17932
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Fix For: 3.0.0
>
> Attachments: HIVE-17932.01.patch
>
>





[jira] [Commented] (HIVE-14495) Add SHOW MATERIALIZED VIEWS statement

2017-11-19 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258775#comment-16258775
 ] 

Lefty Leverenz commented on HIVE-14495:
---

Doc note:  This needs to be documented in the wiki.

* [DDL -- SHOW | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Show]

Added a TODOC3.0 label.

> Add SHOW MATERIALIZED VIEWS statement
> -
>
> Key: HIVE-14495
> URL: https://issues.apache.org/jira/browse/HIVE-14495
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-14495.01.patch, HIVE-14495.patch
>
>
> In the spirit of {{SHOW TABLES}}, we should support the following statement:
> {code:sql}
> SHOW MATERIALIZED VIEWS [IN database_name] ['identifier_with_wildcards'];
> {code}
> In contrast to {{SHOW TABLES}}, this command would only list the materialized 
> views.





[jira] [Commented] (HIVE-15018) ALTER rewriting flag in materialized view

2017-11-19 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258776#comment-16258776
 ] 

Lefty Leverenz commented on HIVE-15018:
---

Doc note:  This needs to be documented in the wiki.

* [DDL | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL]

Added a TODOC3.0 label.

> ALTER rewriting flag in materialized view 
> --
>
> Key: HIVE-15018
> URL: https://issues.apache.org/jira/browse/HIVE-15018
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-15018.01.patch, HIVE-15018.patch
>
>
> We should extend the ALTER statement in case we want to change the rewriting 
> behavior of the materialized view after we have created it.
> {code:sql}
> ALTER MATERIALIZED VIEW [db_name.]materialized_view_name DISABLE REWRITE;
> {code}
> {code:sql}
> ALTER MATERIALIZED VIEW [db_name.]materialized_view_name ENABLE REWRITE;
> {code}





[jira] [Updated] (HIVE-14495) Add SHOW MATERIALIZED VIEWS statement

2017-11-19 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14495:
--
Labels: TODOC3.0  (was: )

> Add SHOW MATERIALIZED VIEWS statement
> -
>
> Key: HIVE-14495
> URL: https://issues.apache.org/jira/browse/HIVE-14495
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-14495.01.patch, HIVE-14495.patch
>
>





[jira] [Updated] (HIVE-15018) ALTER rewriting flag in materialized view

2017-11-19 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-15018:
--
Labels: TODOC3.0  (was: )

> ALTER rewriting flag in materialized view 
> --
>
> Key: HIVE-15018
> URL: https://issues.apache.org/jira/browse/HIVE-15018
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-15018.01.patch, HIVE-15018.patch
>
>





[jira] [Commented] (HIVE-17964) HoS: some spark configs doesn't require re-creating a session

2017-11-19 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258760#comment-16258760
 ] 

Lefty Leverenz commented on HIVE-17964:
---

Doc note:  This adds *hive.spark.rsc.conf.list* to HiveConf.java, so it needs 
to be documented in the wiki.

* [Configuration Properties -- Spark | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Spark]

Added a TODOC3.0 label.

> HoS: some spark configs doesn't require re-creating a session
> -
>
> Key: HIVE-17964
> URL: https://issues.apache.org/jira/browse/HIVE-17964
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17964.1.patch, HIVE-17964.2.patch, 
> HIVE-17964.3.patch
>
>
> I guess the {{hive.spark.}} configs were initially intended for the RSC. 
> Therefore when they're changed, we'll re-create the session for them to take 
> effect. There are some configs not related to RSC that also start with 
> {{hive.spark.}}. We should rename them so that we don't unnecessarily 
> re-create sessions, which is usually time-consuming.





[jira] [Updated] (HIVE-17964) HoS: some spark configs doesn't require re-creating a session

2017-11-19 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-17964:
--
Labels: TODOC3.0  (was: )

> HoS: some spark configs doesn't require re-creating a session
> -
>
> Key: HIVE-17964
> URL: https://issues.apache.org/jira/browse/HIVE-17964
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17964.1.patch, HIVE-17964.2.patch, 
> HIVE-17964.3.patch
>
>





[jira] [Commented] (HIVE-14560) Support exchange partition between s3 and hdfs tables

2017-11-19 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258756#comment-16258756
 ] 

Lefty Leverenz commented on HIVE-14560:
---

Should this be documented in the wiki, or is it just a bug fix?

> Support exchange partition between s3 and hdfs tables
> -
>
> Key: HIVE-14560
> URL: https://issues.apache.org/jira/browse/HIVE-14560
> Project: Hive
>  Issue Type: Bug
>Reporter: Abdullah Yousufi
>Assignee: Abdullah Yousufi
> Fix For: 3.0.0
>
> Attachments: HIVE-14560.02.patch, HIVE-14560.patch
>
>
> {code}
> alter table s3_tbl exchange partition (country='USA', state='CA') with table 
> hdfs_tbl;
> {code}
> results in:
> {code}
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got 
> exception: java.lang.IllegalArgumentException Wrong FS: 
> s3a://hive-on-s3/s3_tbl/country=USA/state=CA, expected: 
> hdfs://localhost:9000) (state=08S01,code=1)
> {code}
> because the check for whether the s3 destination table path exists occurs on 
> the hdfs filesystem.
> Furthermore, exchanging partitions between s3 and hdfs fails because the hdfs 
> rename operation is not supported across filesystems. The fix uses copy + 
> deletion when the file systems differ.
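
The rename versus copy + deletion decision described above can be sketched as below. This is a simplified illustration and not the code from the patch: the class, enum, and method names are hypothetical, and it assumes that two paths are on the same filesystem exactly when their URI scheme and authority match.

```java
import java.net.URI;

// Hypothetical illustration: choose how to move a partition directory.
final class ExchangeMove {
    enum Strategy { RENAME, COPY_THEN_DELETE }

    private ExchangeMove() {}

    /** Assumption: same scheme and same authority means same filesystem. */
    static boolean sameFileSystem(URI a, URI b) {
        return equalsIgnoreCase(a.getScheme(), b.getScheme())
            && equalsIgnoreCase(a.getAuthority(), b.getAuthority());
    }

    /** Rename is only attempted within one filesystem; otherwise copy, then delete the source. */
    static Strategy choose(URI source, URI destination) {
        return sameFileSystem(source, destination)
            ? Strategy.RENAME
            : Strategy.COPY_THEN_DELETE;
    }

    // Null-safe comparison, since a URI may lack a scheme or an authority.
    private static boolean equalsIgnoreCase(String x, String y) {
        return x == null ? y == null : x.equalsIgnoreCase(y);
    }
}
```

For the paths in the error message above, `s3a://hive-on-s3/...` and `hdfs://localhost:9000/...` differ in both scheme and authority, so the sketch selects copy followed by deletion.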





[jira] [Commented] (HIVE-18056) CachedStore: Have a whitelist/blacklist config to allow selective caching of tables/partitions and allow read while prewarming

2017-11-19 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258730#comment-16258730
 ] 

Lefty Leverenz commented on HIVE-18056:
---

Doc note:  This adds *hive.metastore.cached.rawstore.cached.object.whitelist* 
and *hive.metastore.cached.rawstore.cached.object.blacklist* to HiveConf.java, 
so they need to be documented in the wiki.

* [Configuration Properties -- Metastore | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-MetaStore]

General documentation is also needed for CachedStore.

Added a TODOC3.0 label.

> CachedStore: Have a whitelist/blacklist config to allow selective caching of 
> tables/partitions and allow read while prewarming
> --
>
> Key: HIVE-18056
> URL: https://issues.apache.org/jira/browse/HIVE-18056
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Daniel Dai
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18056.1.patch, HIVE-18056.2.patch, 
> HIVE-18056.3.patch, HIVE-18056.4.patch, HIVE-18056.5.patch, 
> HIVE-18056.6.patch, HIVE-18056.7.patch, HIVE-18056.8.patch
>
>






[jira] [Commented] (HIVE-17489) Separate client-facing and server-side Kerberos principals, to support HA

2017-11-13 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250954#comment-16250954
 ] 

Lefty Leverenz commented on HIVE-17489:
---

Thanks for the doc, [~mithun].  I did some minor editing, documented the other 
new parameter (*hive.server2.authentication.client.kerberos.principal*), and 
added cross-references between them.

Please review and let me know if the cross-references were a good idea or not.  
(Does HA mean High Availability?)

* [hive.metastore.client.kerberos.principal | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.metastore.client.kerberos.principal]
* [hive.server2.authentication.client.kerberos.principal | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.server2.authentication.client.kerberos.principal]

> Separate client-facing and server-side Kerberos principals, to support HA
> -
>
> Key: HIVE-17489
> URL: https://issues.apache.org/jira/browse/HIVE-17489
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Mithun Radhakrishnan
>Assignee: Thiruvel Thirumoolan
> Fix For: 3.0.0, 2.4.0, 2.2.1
>
> Attachments: HIVE-17489.2-branch-2.patch, HIVE-17489.2.patch, 
> HIVE-17489.2.patch, HIVE-17489.3-branch-2.patch, HIVE-17489.3.patch, 
> HIVE-17489.4-branch-2.patch, HIVE-17489.4.patch
>
>
> On deployments of the Hive metastore where a farm of servers is fronted by a 
> VIP, the hostname of the VIP (e.g. {{mycluster-hcat.blue.myth.net}}) will 
> differ from those of the actual boxen in the farm (e.g. 
> {{mycluster-hcat-\[0..3\].blue.myth.net}}).
> Such a deployment messes up Kerberos auth, with principals like 
> {{hcat/mycluster-hcat.blue.myth@grid.myth.net}}. Host-based checks will 
> disallow servers behind the VIP from using the VIP's hostname in their 
> principals when accessing, say, HDFS.
> The solution would be to decouple the server-side principal (used to access 
> other services like HDFS as a client) from the client-facing principal (used 
> from Hive-client, BeeLine, etc.).





[jira] [Comment Edited] (HIVE-14497) Fine control for using materialized views in rewriting

2017-11-13 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15756700#comment-15756700
 ] 

Lefty Leverenz edited comment on HIVE-14497 at 11/14/17 5:04 AM:
-

Doc note:  This needs to be documented with a new section in the DDL wikidoc, 
perhaps after Create/Drop/Alter View.

* [Hive DDL | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-HiveDataDefinitionLanguage]

Added a TODOC2.2 label.

Update 14/Nov/17:  Changed the label to TODOC2.3.


was (Author: le...@hortonworks.com):
Doc note:  This needs to be documented with a new section in the DDL wikidoc, 
perhaps after Create/Drop/Alter View.

* [Hive DDL | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-HiveDataDefinitionLanguage]

Added a TODOC2.2 label.

> Fine control for using materialized views in rewriting
> --
>
> Key: HIVE-14497
> URL: https://issues.apache.org/jira/browse/HIVE-14497
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC2.3
> Fix For: 2.3.0
>
>
> Follow-up of HIVE-14495. Since the number of materialized views in the system 
> might grow very large, and query rewriting using materialized views might be 
> very expensive, we need to include a mechanism to enable/disable materialized 
> views for query rewriting.
> Thus, we should extend the CREATE MATERIALIZED VIEW statement as follows:
> {code:sql}
> CREATE MATERIALIZED VIEW [IF NOT EXISTS] [db_name.]materialized_view_name
>   [BUILD DEFERRED]
>   [ENABLE REWRITE] -- NEW!
>   [COMMENT materialized_view_comment]
>   [
>[ROW FORMAT row_format] 
>[STORED AS file_format]
>  | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]
>   ]
>   [LOCATION hdfs_path]
>   [TBLPROPERTIES (property_name=property_value, ...)]
>   AS select_statement;
> {code}
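
A minimal sketch of a statement that follows the syntax proposed above. The database, table, view, and column names are hypothetical and chosen only for illustration; the clause order simply mirrors the grammar in the description.

{code:sql}
-- Hypothetical names (sales_db, sales, mv_daily_sales); not taken from the issue.
CREATE MATERIALIZED VIEW IF NOT EXISTS sales_db.mv_daily_sales
  ENABLE REWRITE  -- the new clause: marks this view as a candidate for query rewriting
  COMMENT 'Daily sales totals'
  STORED AS ORC
  AS
  SELECT sale_date, SUM(amount) AS total_amount
  FROM sales_db.sales
  GROUP BY sale_date;
{code}

Under this proposal, a view created without ENABLE REWRITE would presumably be left out of rewriting, which is how the number of candidate views could be kept small.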





[jira] [Updated] (HIVE-14497) Fine control for using materialized views in rewriting

2017-11-13 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14497:
--
Labels: TODOC2.3  (was: TODOC2.2)

> Fine control for using materialized views in rewriting
> --
>
> Key: HIVE-14497
> URL: https://issues.apache.org/jira/browse/HIVE-14497
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC2.3
> Fix For: 2.3.0
>
>
> Follow-up of HIVE-14495. Since the number of materialized views in the system 
> might grow very large, and query rewriting using materialized views might be 
> very expensive, we need to include a mechanism to enable/disable materialized 
> views for query rewriting.
> Thus, we should extend the CREATE MATERIALIZED VIEW statement as follows:
> {code:sql}
> CREATE MATERIALIZED VIEW [IF NOT EXISTS] [db_name.]materialized_view_name
>   [BUILD DEFERRED]
>   [ENABLE REWRITE] -- NEW!
>   [COMMENT materialized_view_comment]
>   [
>[ROW FORMAT row_format] 
>[STORED AS file_format]
>  | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]
>   ]
>   [LOCATION hdfs_path]
>   [TBLPROPERTIES (property_name=property_value, ...)]
>   AS select_statement;
> {code}





[jira] [Commented] (HIVE-17809) Implement per pool trigger validation and move sessions across pools

2017-11-13 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250849#comment-16250849
 ] 

Lefty Leverenz commented on HIVE-17809:
---

Should this be documented in the wiki?

> Implement per pool trigger validation and move sessions across pools
> 
>
> Key: HIVE-17809
> URL: https://issues.apache.org/jira/browse/HIVE-17809
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 3.0.0
>
> Attachments: HIVE-17809.1.patch, HIVE-17809.2.patch, 
> HIVE-17809.3.patch, HIVE-17809.4.patch, HIVE-17809.5.patch, HIVE-17809.6.patch
>
>
> In HIVE-17508, trigger validation is applied to all pools at once. This is a 
> follow-up to implement trigger validation at the per-pool level. 
> This should also implement resolution for multiple applicable actions, as per 
> the RB discussion.





[jira] [Comment Edited] (HIVE-14878) integrate MM tables into ACID: add separate ACID type

2017-11-13 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591168#comment-15591168
 ] 

Lefty Leverenz edited comment on HIVE-14878 at 11/14/17 4:21 AM:
-

How shall we track doc issues for this branch?  Some unnumbered branches have 
their own TODOC label and others have a separate JIRA issue for documentation.

Anyway, this should be documented when the branch gets merged into master.

* new table property "transactional_properties"="insert_only"
* new description for configuration property *hive.txn.operational.properties*

Edit 4/Nov/17:  Actually only the table property should be documented, because 
*hive.txn.operational.properties* is for internal use only (see HIVE-14035 
comments).

Edit 13/Nov/17:  Adding a TODOC3.0 label for the new table property 
"transactional_properties"="insert_only" because this was merged to master by 
HIVE-15212.

* [DDL -- table properties | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-listTableProperties]
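
For illustration, a sketch of how the new table property might appear in a CREATE TABLE statement. The table name and columns are hypothetical, and pairing the property with "transactional"="true" is an assumption that is not stated in this comment.

{code:sql}
-- Hypothetical table; only "transactional_properties"="insert_only" comes from the comment above.
CREATE TABLE mm_events (
  event_id BIGINT,
  payload  STRING
)
STORED AS TEXTFILE
TBLPROPERTIES (
  "transactional"="true",  -- assumption: the table is also marked transactional
  "transactional_properties"="insert_only"
);
{code}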


was (Author: le...@hortonworks.com):
How shall we track doc issues for this branch?  Some unnumbered branches have 
their own TODOC label and others have a separate JIRA issue for documentation.

Anyway, this should be documented when the branch gets merged into master.

* new table property "transactional_properties"="insert_only"
* new description for configuration property *hive.txn.operational.properties*

Edit 4/Nov/17:  Actually only the table property should be documented, because 
*hive.txn.operational.properties* is for internal use only (see HIVE-14035 
comments).

Edit 13/Nov/17:  Adding a TODOC3.0 label for the new table property 
"transactional_properties"="insert_only" because this was merged to master by 
HIVE-15212.

> integrate MM tables into ACID: add separate ACID type
> -
>
> Key: HIVE-14878
> URL: https://issues.apache.org/jira/browse/HIVE-14878
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Wei Zheng
>  Labels: TODOC3.0
> Fix For: hive-14535
>
> Attachments: HIVE-14878.1.patch
>
>






[jira] [Updated] (HIVE-14878) integrate MM tables into ACID: add separate ACID type

2017-11-13 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14878:
--
Labels: TODOC3.0  (was: )

> integrate MM tables into ACID: add separate ACID type
> -
>
> Key: HIVE-14878
> URL: https://issues.apache.org/jira/browse/HIVE-14878
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Wei Zheng
>  Labels: TODOC3.0
> Fix For: hive-14535
>
> Attachments: HIVE-14878.1.patch
>
>






[jira] [Comment Edited] (HIVE-14878) integrate MM tables into ACID: add separate ACID type

2017-11-13 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591168#comment-15591168
 ] 

Lefty Leverenz edited comment on HIVE-14878 at 11/14/17 4:18 AM:
-

How shall we track doc issues for this branch?  Some unnumbered branches have 
their own TODOC label and others have a separate JIRA issue for documentation.

Anyway, this should be documented when the branch gets merged into master.

* new table property "transactional_properties"="insert_only"
* new description for configuration property *hive.txn.operational.properties*

Edit 4/Nov/17:  Actually only the table property should be documented, because 
*hive.txn.operational.properties* is for internal use only (see HIVE-14035 
comments).

Edit 13/Nov/17:  Adding a TODOC3.0 label for the new table property 
"transactional_properties"="insert_only" because this was merged to master by 
HIVE-15212.


was (Author: le...@hortonworks.com):
How shall we track doc issues for this branch?  Some unnumbered branches have 
their own TODOC label and others have a separate JIRA issue for documentation.

Anyway, this should be documented when the branch gets merged into master.

* new table property "transactional_properties"="insert_only"
* new description for configuration property *hive.txn.operational.properties*

Edit 4/Nov/17:  Actually only the table property should be documented, because 
*hive.txn.operational.properties* is for internal use only (see HIVE-14035 
comments).

> integrate MM tables into ACID: add separate ACID type
> -
>
> Key: HIVE-14878
> URL: https://issues.apache.org/jira/browse/HIVE-14878
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Wei Zheng
>  Labels: TODOC3.0
> Fix For: hive-14535
>
> Attachments: HIVE-14878.1.patch
>
>






[jira] [Issue Comment Deleted] (HIVE-14878) integrate MM tables into ACID: add separate ACID type

2017-11-13 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14878:
--
Comment: was deleted

(was: No doc needed:  This changes the description of 
*hive.txn.operational.properties* but it doesn't need to be documented because 
it's for internal use only.  (See HIVE-14035 comments, 21-22 Aug. 2016.)

HIVE-17458 changes the description again.)

> integrate MM tables into ACID: add separate ACID type
> -
>
> Key: HIVE-14878
> URL: https://issues.apache.org/jira/browse/HIVE-14878
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Wei Zheng
> Fix For: hive-14535
>
> Attachments: HIVE-14878.1.patch
>
>






[jira] [Commented] (HIVE-14535) add insert-only ACID tables to Hive

2017-11-13 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250768#comment-16250768
 ] 

Lefty Leverenz commented on HIVE-14535:
---

Doc note:  HIVE-15212 merged branch-14535 to master for release 3.0.0, so 
general documentation for this feature is needed in the wiki.

These configuration properties were added or changed by the merge:

* *hive.mm.avoid.s3.globstatus* (HIVE-14953) -- new config
* *hive.exim.test.mode* (HIVE-15019) -- new config
* *hive.txn.operational.properties* (HIVE-14878) -- description changed; 
internal so no doc needed

Added a TODOC3.0 label.
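
As a sketch only, a session-level override of the new parameter could look like the following. It assumes *hive.mm.avoid.s3.globstatus* is a boolean and that it may be set at runtime; neither point is confirmed in this comment, and the value shown is not meant to indicate the default.

{code:sql}
-- Assumed boolean value, shown only to illustrate the SET syntax.
SET hive.mm.avoid.s3.globstatus=true;
-- Print the value currently in effect for this session.
SET hive.mm.avoid.s3.globstatus;
{code}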

> add insert-only ACID tables to Hive 
> 
>
> Key: HIVE-14535
> URL: https://issues.apache.org/jira/browse/HIVE-14535
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
>
> Design doc: 
> https://docs.google.com/document/d/1b3t1RywfyRb73-cdvkEzJUyOiekWwkMHdiQ-42zCllY
> Feel free to comment.
> Update: we ended up going with sequence number based implementation
> Update #2: this feature has been partially merged with ACID; the new table 
> type is insert_only ACID. It differs from regular ACID in that it supports 
> only inserts, but it has no restrictions on file format or table type 
> (bucketing) and far fewer restrictions on other operations (export/import, 
> list bucketing, etc.).
> Currently some features that worked when it was separate are not 
> integrated properly; integrating these features is the remaining work in 
> this JIRA.





[jira] [Commented] (HIVE-15019) handle import for MM tables

2017-11-13 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250764#comment-16250764
 ] 

Lefty Leverenz commented on HIVE-15019:
---

Doc note:  This adds *hive.exim.test.mode* to HiveConf.java and branch-14535 
has been merged to master for release 3.0.0 by HIVE-15212, so the wiki needs to 
be updated.

Although most test configs aren't documented, this one is different because it 
doesn't begin with "hive.test" and so wouldn't show up in a simple search.  
Therefore I recommend including it in the Test Properties section of 
Configuration Properties, perhaps with a cross-reference from the Transactions 
section (or a new subsection, if one is added for 
*hive.mm.avoid.s3.globstatus*).

* [Configuration Properties -- Test Properties | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-TestProperties]
* [Configuration Properties -- Transactions and Compactor | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-TransactionsandCompactor]

Added a TODOC3.0 label.

> handle import for MM tables
> ---
>
> Key: HIVE-15019
> URL: https://issues.apache.org/jira/browse/HIVE-15019
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: hive-14535
>
> Attachments: HIVE-15019.WIP.patch, HIVE-15019.patch
>
>






[jira] [Updated] (HIVE-15019) handle import for MM tables

2017-11-13 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-15019:
--
Labels: TODOC3.0  (was: )

> handle import for MM tables
> ---
>
> Key: HIVE-15019
> URL: https://issues.apache.org/jira/browse/HIVE-15019
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: hive-14535
>
> Attachments: HIVE-15019.WIP.patch, HIVE-15019.patch
>
>






[jira] [Comment Edited] (HIVE-15212) merge branch into master

2017-11-13 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248816#comment-16248816
 ] 

Lefty Leverenz edited comment on HIVE-15212 at 11/14/17 3:10 AM:
-

Okay, thanks Sergey.

So far I've only found one configuration parameter added to master by this 
merge (*hive.mm.avoid.s3.globstatus* in HIVE-14953) but there may be a few more.

Update 13/Nov/17:  The merge also added *hive.exim.test.mode* (HIVE-15019) and 
changed the description of *hive.txn.operational.properties* (HIVE-14878) but 
the latter is internal and so doesn't need to be documented.  Most test configs 
aren't documented but perhaps *hive.exim.test.mode* should be because it 
wouldn't show up in a search for "hive.test.*" configs.


was (Author: le...@hortonworks.com):
Okay, thanks Sergey.

So far I've only found one configuration parameter added to master by this 
merge (*hive.mm.avoid.s3.globstatus* in HIVE-14953) but there may be a few more.

> merge branch into master
> 
>
> Key: HIVE-15212
> URL: https://issues.apache.org/jira/browse/HIVE-15212
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0
>
> Attachments: HIVE-15212.01.patch, HIVE-15212.02.patch, 
> HIVE-15212.03.patch, HIVE-15212.04.patch, HIVE-15212.05.patch, 
> HIVE-15212.06.patch, HIVE-15212.07.patch, HIVE-15212.08.patch, 
> HIVE-15212.09.patch, HIVE-15212.10.patch, HIVE-15212.11.patch, 
> HIVE-15212.12.patch, HIVE-15212.12.patch, HIVE-15212.13.patch, 
> HIVE-15212.13.patch, HIVE-15212.14.patch, HIVE-15212.15.patch, 
> HIVE-15212.16.patch, HIVE-15212.17.patch, HIVE-15212.18.patch, 
> HIVE-15212.19.patch, HIVE-15212.20.patch, HIVE-15212.21.patch
>
>






[jira] [Commented] (HIVE-14953) don't use globStatus on S3 in MM tables

2017-11-13 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250738#comment-16250738
 ] 

Lefty Leverenz commented on HIVE-14953:
---

Doc note:  This adds *hive.mm.avoid.s3.globstatus* to HiveConf.java and 
branch-14535 has been merged to master for release 3.0.0 by HIVE-15212, so the 
wiki needs to be updated.

I'm not sure where *hive.mm.avoid.s3.globstatus* belongs in Configuration 
Properties.  Perhaps the Transactions section should have a subsection, 
although so far this is the only new parameter that needs to be documented.

* [Configuration Properties -- Transactions and Compactor | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-TransactionsandCompactor]

Added a TODOC3.0 label.

> don't use globStatus on S3 in MM tables
> ---
>
> Key: HIVE-14953
> URL: https://issues.apache.org/jira/browse/HIVE-14953
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: hive-14535
>
> Attachments: HIVE-14953.01.patch, HIVE-14953.patch
>
>
> Need to investigate if recursive get is faster. Also, normal listStatus might 
> suffice because MM code handles directory structure in a more definite manner 
> than old code; so it knows where the files of interest are to be found.





[jira] [Updated] (HIVE-14953) don't use globStatus on S3 in MM tables

2017-11-13 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14953:
--
Labels: TODOC3.0  (was: )

> don't use globStatus on S3 in MM tables
> ---
>
> Key: HIVE-14953
> URL: https://issues.apache.org/jira/browse/HIVE-14953
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: hive-14535
>
> Attachments: HIVE-14953.01.patch, HIVE-14953.patch
>
>
> Need to investigate if recursive get is faster. Also, normal listStatus might 
> suffice because MM code handles directory structure in a more definite manner 
> than old code; so it knows where the files of interest are to be found.




