[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2020-01-02 Thread Sofia (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17007014#comment-17007014
 ] 

Sofia commented on HIVE-21367:
--

[~sophie1] we're still using this workaround with the same version. I don't 
know whether this bug has been fixed in the latest version.

 

> Hive returns an incorrect result when using a simple select query
> -
>
> Key: HIVE-21367
> URL: https://issues.apache.org/jira/browse/HIVE-21367
> Project: Hive
> Issue Type: Bug
> Components: Hive, HiveServer2, JDBC, SQL
> Affects Versions: 3.1.0
> Environment:
>   - HDP 3.1
>   - Hive 3.1.0
>   - Spark 2.3.2
>   - Sqoop 1.4.7
> Reporter: LEMBARKI Mohamed Amine
> Priority: Blocker
> Attachments: mapred_input_dir_recursive.png
>
>
> Hive returns an incorrect result for a simple select query with a 
> where clause, while the same query with an aggregation returns the 
> correct result.
> The problem arises for tables created by Spark or Sqoop.
> When we use spark-shell with HiveWarehouseConnector, the query also returns 
> the correct result.
>  
> Workflow: 
>      - Load data into Hive with Sqoop
>      - Process the data with Spark using HiveWarehouseConnector and store it in 
> Hive
>   
> The log output is below:
>  
>  */-* 
>  *1 - Executing Query : select code from db1.tbl1 where code = '123'*
>  */-*
> {code:java}
> [data@data1 ~]$ hive -e "select code from db1.tbl1 where code = '123'"
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Connecting to 
> jdbc:hive2://data2:2181,data1:2181/default;password=data;serviceDiscoveryMode=zooKeeper;user=data;zooKeeperNamespace=hiveserver2
> 19/03/01 10:31:36 [main]: INFO jdbc.HiveConnection: Connected to data2:1
> Connected to: Apache Hive (version 3.1.0.3.1.0.0-78)
> Driver: Hive JDBC (version 3.1.0.3.1.0.0-78)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> INFO : Compiling 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2): 
> select code from db1.tbl1 where code = '123'
> INFO : Semantic Analysis Completed (retrial = false)
> INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:code, 
> type:string, comment:null)], properties:null)
> INFO : Completed compiling 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2); 
> Time taken: 0.142 seconds
> INFO : Executing 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2): 
> select code from db1.tbl1 where code = '123'
> INFO : Completed executing 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2); 
> Time taken: 0.003 seconds
> INFO : OK
> +--+
> | code |
> +--+
> +--+
> No rows selected (4,307 seconds)
> Beeline version 3.1.0.3.1.0.0-78 by Apache Hive
> Closing: 0: 
> jdbc:hive2://data2:2181,data1:2181/default;password=data;serviceDiscoveryMode=zooKeeper;user=data;zooKeeperNamespace=hiveserver2
> {code}
> */-*
> *2 - Executing Query using count :* 
>       *select count(code) from db1.tbl1 where code = '123'*
>  */-*
> {code:java}
> [data@data1 ~]$ hive -e "select count(code) from db1.tbl1 where code = '123'"
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Connecting to 
> jdbc:hive2://data2:2181,data1:2181/default;password=data;serviceDiscoveryMode=zooKeeper;user=data;zooKeeperNamespace=hiveserver2
> 19/03/01 10:31:56 [main]: INFO jdbc.HiveConnection: Connected to data2:1
> Connected to: Apache Hive (version 3.1.0.3.1.0.0-78)
> Driver: Hive JDBC (version 3.1.0.3.1.0.0-78)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> INFO : Compiling 
> command(queryId=hive_20190301103149_90aa338b-b99b-4f1c-b7e5-6b285f64cb3e): 
> select count(code) from db1.tbl1 where code = 

[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2020-01-01 Thread sophie (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006465#comment-17006465
 ] 

sophie commented on HIVE-21367:
---

Has this bug been fixed in a later version? Or do we still need to apply the 
same workaround?


[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-05 Thread LEMBARKI Mohamed Amine (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784389#comment-16784389
 ] 

LEMBARKI Mohamed Amine commented on HIVE-21367:
---

Thanks [~starphin] for this workaround :)

Maybe we should look for another parameter that makes FetchTask work.


[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-05 Thread Sofia (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784226#comment-16784226
 ] 

Sofia commented on HIVE-21367:
--

Both Sqoop and Spark create subdirectories.

Setting the property hive.fetch.task.conversion to none resolved the issue.

Thanks [~starphin]
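
For reference, a minimal sketch of applying the workaround per invocation rather than cluster-wide. It assumes the `hive` launcher on HDP 3.1 is the Beeline wrapper seen in the log output and accepts `--hiveconf`. The helper name `fetch_off_cmd` is made up for illustration, and it only prints the command so it can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive): print the CLI call that runs a query
# with fetch-task conversion disabled, so the query is compiled into a regular
# job instead of being served by FetchTask.
# Note: the query must not contain double quotes, since it is wrapped in them.
fetch_off_cmd() {
  printf 'hive --hiveconf hive.fetch.task.conversion=none -e "%s"\n' "$1"
}

# Query taken from the issue description.
fetch_off_cmd "select code from db1.tbl1 where code = '123'"
```

Inside an already open session, the equivalent would be `set hive.fetch.task.conversion=none;` before the query, provided the server allows this property to be changed at session level.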


[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-04 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783978#comment-16783978
 ] 

star commented on HIVE-21367:
-

Can you verify which one, Sqoop or Spark, creates the subdirectories?

Also, setting hive.fetch.task.conversion to none disables FetchTask and returns 
the correct result.
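
One way to check for subdirectories is to list the table directory recursively and look for data files that sit below a subdirectory. The following is a rough sketch only: the helper `nested_files` is hypothetical, it assumes the usual `hdfs dfs -ls -R` output with the path in the last column (no spaces in paths), and the table location in the usage comment is a placeholder, not taken from this issue.

```shell
#!/bin/sh
# Hypothetical helper: read `hdfs dfs -ls -R <table_dir>` output on stdin and
# print the files that live in a subdirectory of <table_dir> rather than
# directly in its root.
nested_files() {
  root="${1%/}"
  awk -v root="$root" '
    /^-/ {                                  # regular files only; directories start with d
      path = $NF                            # last column is the full path
      rel = substr(path, length(root) + 2)  # path relative to the table root
      if (index(path, root "/") == 1 && index(rel, "/") > 0) print path
    }'
}

# Usage on a cluster (placeholder path):
#   hdfs dfs -ls -R /path/to/warehouse/db1.db/tbl1 \
#     | nested_files /path/to/warehouse/db1.db/tbl1
```

Running this once after a Sqoop import and once after a Spark write should show which of the two places files below the table root.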


[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-04 Thread Sofia (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783485#comment-16783485
 ] 

Sofia commented on HIVE-21367:
--

The target table is loaded from two different sources:
 * *From Sqoop*: when loading tables we use the following 
code.

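{code:bash}
# $1 = source table, $2 = target Hive database; CONNECTION, USER, PASSWORD
# and TBNAME are set elsewhere in the script.
sqoop import --connect "${CONNECTION}" \
--username "${USER}" \
--password "${PASSWORD}" \
--table "$1" \
--hive-database "$2" \
--hive-table "${TBNAME}" \
--hive-import \
--as-orcfile \
--hive-overwrite \
-m 1 \
--delete-target-dir
{code}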
 * *From Spark*: when processing the data, we store the output as a table in 
Hive using the following code.
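{code:scala}
import com.hortonworks.hwc.HiveWarehouseSession

// df, mode and tableName are defined elsewhere in the job.
df.write
  .mode(mode)
  .format(HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR)
  .option("table", tableName)
  .save()
{code}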
How do we load the data into the root path of the target table in each case?


[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-04 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783445#comment-16783445
 ] 

star commented on HIVE-21367:
-

It seems mapred.input.dir.recursive only takes effect in MapReduce, not in 
FetchTask. I have to figure out why Hive doesn't support this configuration 
there. Maybe there are other considerations I haven't noticed yet. By the way, 
why do you create subdirectories when using Sqoop? You can load the data into 
the root path of the target table.


[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-04 Thread LEMBARKI Mohamed Amine (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783299#comment-16783299
 ] 

LEMBARKI Mohamed Amine commented on HIVE-21367:
---

Hi,

We've set the property mapred.input.dir.recursive to true using Ambari, but 
unfortunately the problem persists.

Does this property also apply to FetchTask?

!mapred_input_dir_recursive.png!


[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-04 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783264#comment-16783264
 ] 

star commented on HIVE-21367:
-

Basically, Hive turns a simple select into a 'FetchTask', which is executed 
locally (no MapReduce task), while a more complicated select is executed as a 
MapReduce (or Tez) task, which supports subdirs. FetchTask behaves differently 
from MapReduce in this respect.

Setting mapred.input.dir.recursive to true in hive-site.xml is expected to 
solve the problem.
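
To make the suspected difference concrete, here is a minimal Python sketch. It does not use Hive at all: it only contrasts a listing that looks at the top level of a table directory with one that descends into subdirectories. The directory name delta_subdir, the file name part-00000 and all function names are hypothetical, and the assumption that FetchTask behaves like the flat listing is the hypothesis from this comment, not a confirmed description of Hive internals.

```python
import os
import tempfile


def list_flat(table_dir):
    """Return data files directly under table_dir, ignoring subdirectories."""
    return sorted(
        name for name in os.listdir(table_dir)
        if os.path.isfile(os.path.join(table_dir, name))
    )


def list_recursive(table_dir):
    """Return data files under table_dir, descending into subdirectories."""
    found = []
    for root, _dirs, files in os.walk(table_dir):
        for name in files:
            found.append(os.path.relpath(os.path.join(root, name), table_dir))
    return sorted(found)


def make_demo_table():
    """Create a hypothetical table directory whose only data file sits in a subdirectory."""
    table_dir = tempfile.mkdtemp(prefix="tbl1_")
    sub = os.path.join(table_dir, "delta_subdir")  # hypothetical directory name
    os.mkdir(sub)
    with open(os.path.join(sub, "part-00000"), "w") as fh:  # hypothetical file name
        fh.write("123\n")
    return table_dir


if __name__ == "__main__":
    demo = make_demo_table()
    print("flat listing:     ", list_flat(demo))
    print("recursive listing:", list_recursive(demo))
```

If the hypothesis holds, the flat listing corresponds to the simple select that returns no rows, and the recursive listing corresponds to the count query that finds the data.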


[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-04 Thread Sofia (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783236#comment-16783236
 ] 

Sofia commented on HIVE-21367:
--

Hi [~starphin], why does Hive behave that way and create subdirs when executing 
a simple select? Is there any workaround for that?


[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-04 Thread LEMBARKI Mohamed Amine (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783205#comment-16783205
 ] 

LEMBARKI Mohamed Amine commented on HIVE-21367:
---

Hi,

I just moved the files up into the tbl1 directory, and the query now returns a 
correct result!
{code:java}
[hdfs@data1 ~]$ hadoop fs -cp \
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/* \
/warehouse/tablespace/managed/hive/db1.db/tbl1/
[hdfs@data1 ~]$ hadoop fs -rm -r \
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_
{code}
So the question now is: how can Hive support subdirectories?
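
For anyone who wants to rehearse this workaround before touching HDFS, the following is a rough local-filesystem sketch of the same copy-then-delete sequence as the two hadoop fs commands above. It only uses Python's standard library on a temporary directory; the function name flatten_subdir and the names delta_subdir and part-00000 are made up for illustration, and the safety checks (refusing nested directories and name clashes) are my own assumption about what one would want, not part of the original commands.

```python
import os
import shutil
import tempfile


def flatten_subdir(table_dir, subdir_name):
    """Copy every file from table_dir/subdir_name into table_dir, then remove the subdirectory.

    Validates first, so nothing is copied or deleted if the subdirectory holds
    nested directories or a file whose name already exists in table_dir.
    """
    sub = os.path.join(table_dir, subdir_name)
    names = sorted(os.listdir(sub))
    for name in names:
        if not os.path.isfile(os.path.join(sub, name)):
            raise ValueError("unexpected nested entry: " + name)
        if os.path.exists(os.path.join(table_dir, name)):
            raise FileExistsError(os.path.join(table_dir, name))
    for name in names:
        shutil.copy2(os.path.join(sub, name), os.path.join(table_dir, name))
    shutil.rmtree(sub)  # only reached after every file has been copied
    return names


if __name__ == "__main__":
    table = tempfile.mkdtemp(prefix="tbl1_")
    os.mkdir(os.path.join(table, "delta_subdir"))  # hypothetical directory name
    with open(os.path.join(table, "delta_subdir", "part-00000"), "w") as fh:
        fh.write("123\n")
    print("copied:", flatten_subdir(table, "delta_subdir"))
    print("table root now holds:", sorted(os.listdir(table)))
```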


[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-04 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783114#comment-16783114
 ] 

star commented on HIVE-21367:
-

Or you can move the files from the subdirs to the root dir of the table. I 
suspect the problem is caused by the subdirs; Hive does not support subdirs by 
default.


[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-02 Thread LEMBARKI Mohamed Amine (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782443#comment-16782443
 ] 

LEMBARKI Mohamed Amine commented on HIVE-21367:
---

Hi, 

I tried adding these parameters when running the query, but got errors:
{code:java}
/--
Running the Query with parameters
/-
- set hive.input.dir.recursive=true;
- set hive.mapred.supports.subdirectories=true;
- set hive.supports.subdirectories=true;
- set mapred.input.dir.recursive=true;

Error: Error while processing statement: hive configuration 
hive.input.dir.recursive does not exists. (state=42000,code=1)
Error: Error while processing statement: Cannot modify 
hive.mapred.supports.subdirectories at runtime. It is not in list of params 
that are allowed to be modified at runtime (state=42000,code=1)
Error: Error while processing statement: hive configuration 
hive.supports.subdirectories does not exists. (state=42000,code=1)
Error: Error while processing statement: Cannot modify 
mapred.input.dir.recursive at runtime. It is not in list of params that are 
allowed to be modified at runtime (state=42000,code=1)
{code}
I also tried adding them to hive-site.xml and rerunning the query, but nothing 
changed.

Full log below:

{code}
[data@data1 ~]$ hive -e "set hive.input.dir.recursive=true; set 
hive.mapred.supports.subdirectories=true; set 
hive.supports.subdirectories=true; set mapred.input.dir.recursive=true;select 
code from db1.tbl1 where code = '123';"
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to 
jdbc:hive2://data2:2181,data1:2181/default;password=data;serviceDiscoveryMode=zooKeeper;user=data;zooKeeperNamespace=hiveserver2
19/03/02 17:12:12 [main]: INFO jdbc.HiveConnection: Connected to data2:1
Connected to: Apache Hive (version 3.1.0.3.1.0.0-78)
Driver: Hive JDBC (version 3.1.0.3.1.0.0-78)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Error: Error while processing statement: hive configuration 
hive.input.dir.recursive does not exists. (state=42000,code=1)
Error: Error while processing statement: Cannot modify 
hive.mapred.supports.subdirectories at runtime. It is not in list of params 
that are allowed to be modified at runtime (state=42000,code=1)
Error: Error while processing statement: hive configuration 
hive.supports.subdirectories does not exists. (state=42000,code=1)
Error: Error while processing statement: Cannot modify 
mapred.input.dir.recursive at runtime. It is not in list of params that are 
allowed to be modified at runtime (state=42000,code=1)
Closing: 0: 
jdbc:hive2://data2:2181,data1:2181/default;password=data;serviceDiscoveryMode=zooKeeper;user=data;zooKeeperNamespace=hiveserver2

{code}
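
The four errors in the log above fall into two groups, which is easy to miss when reading the wrapped output. The small Python sketch below re-joins the wrapped error lines (copied from the log) and sorts the property names by the reason Hive gave. The regular expressions and the function name classify are my own; they only match the wording seen in this log.

```python
import re

# Error messages copied from the log above, with wrapped lines re-joined.
ERRORS = [
    "Error: Error while processing statement: hive configuration "
    "hive.input.dir.recursive does not exists. (state=42000,code=1)",
    "Error: Error while processing statement: Cannot modify "
    "hive.mapred.supports.subdirectories at runtime. It is not in list of params "
    "that are allowed to be modified at runtime (state=42000,code=1)",
    "Error: Error while processing statement: hive configuration "
    "hive.supports.subdirectories does not exists. (state=42000,code=1)",
    "Error: Error while processing statement: Cannot modify "
    "mapred.input.dir.recursive at runtime. It is not in list of params that are "
    "allowed to be modified at runtime (state=42000,code=1)",
]

UNKNOWN = re.compile(r"hive configuration (\S+) does not exists")
LOCKED = re.compile(r"Cannot modify (\S+) at runtime")


def classify(lines):
    """Group property names by the reason the `set` statement was rejected."""
    result = {"unknown": [], "not_modifiable_at_runtime": []}
    for line in lines:
        match = UNKNOWN.search(line)
        if match:
            result["unknown"].append(match.group(1))
            continue
        match = LOCKED.search(line)
        if match:
            result["not_modifiable_at_runtime"].append(match.group(1))
    return result


if __name__ == "__main__":
    for reason, names in classify(ERRORS).items():
        print(reason + ":", ", ".join(names))
```

Read this way, two of the names are not recognized by this Hive version at all, while the other two are recognized but cannot be changed from within a session on this cluster.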


[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-01 Thread LEMBARKI Mohamed Amine (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781898#comment-16781898
 ] 

LEMBARKI Mohamed Amine commented on HIVE-21367:
---

Hi, thank you, I will try that!

> Hive returns an incorrect result when using a simple select query
> -
>
> Key: HIVE-21367
> URL: https://issues.apache.org/jira/browse/HIVE-21367
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2, JDBC, SQL
>Affects Versions: 3.1.0
> Environment:  - HDP 3.1
>   - Hive 3.1.0
>   - Spark 2.3.2
>   - Sqoop 1.4.7
>Reporter: LEMBARKI Mohamed Amine
>Priority: Blocker
>
> Hive returns an incorrect result when using a simple select query with a 
> where clause
>  While with an aggregation it returns a correct result
>  The problem arises for tables created by Spark or Sqoop
> Also when we use spark-shell with HiveWarehouseConnector it returns a correct 
> result
>  
> Workflow: 
>      - Loading data with sqoop to hive
>      - Data processing with spark using HiveWarehouseConnector and Storage to 
> Hive
>   
> below the error log :
>  
>  */-* 

[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-01 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781837#comment-16781837
 ] 

star commented on HIVE-21367:
-

Try setting the following parameters:

set hive.input.dir.recursive=true; 
set hive.mapred.supports.subdirectories=true; 
set hive.supports.subdirectories=true; 
set mapred.input.dir.recursive=true;
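
A minimal sketch of how these settings might be tried, assuming they are issued in the same Hive/Beeline session right before re-running the two queries from the issue description. The table name and filter value are taken from the report; whether the settings actually resolve the problem is not confirmed here.

{code:sql}
-- Session-level settings suggested above; they only affect the current session.
set hive.input.dir.recursive=true;
set hive.mapred.supports.subdirectories=true;
set hive.supports.subdirectories=true;
set mapred.input.dir.recursive=true;

-- Re-run the plain select that returned no rows in the report.
select code from db1.tbl1 where code = '123';

-- Compare with the aggregate, which the report says returns a correct result.
select count(code) from db1.tbl1 where code = '123';
{code}

If the settings take effect, the first query would be expected to return as many rows as the count reported by the second.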


[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-01 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781830#comment-16781830
 ] 

star commented on HIVE-21367:
-

You could generate some dummy, non-confidential data and reproduce the bug with it.


[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-01 Thread LEMBARKI Mohamed Amine (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781797#comment-16781797
 ] 

LEMBARKI Mohamed Amine commented on HIVE-21367:
---

Here is the HDFS path of this table:
{code:java}
[hdfs@data1 root]$ hadoop fs -ls -e 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_
Found 200 items
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/00_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/01_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/02_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/03_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/04_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/05_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/06_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/07_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/08_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/09_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/10_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/11_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/12_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/13_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/14_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/15_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/16_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/17_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/18_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/19_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/20_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/21_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/22_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/23_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/24_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/25_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/26_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/27_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/28_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/29_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/30_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/31_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/32_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 
/warehouse/tablespace/managed/hive/db1.db/tbl1/delta_001_001_/33_0
-rw-rw+ 3 hive hadoop 574 2019-03-01 09:06 

[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-01 Thread LEMBARKI Mohamed Amine (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781791#comment-16781791
 ] 

LEMBARKI Mohamed Amine commented on HIVE-21367:
---

Hi,
{code:java}
CREATE TABLE db1.tbl1(
code STRING(2147483647)
);
{code}
*The HDFS files are in ORC format and contain confidential data; here is the 
table description:*
{code:java}
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to 
jdbc:hive2://data2:2181,data1:2181/default;password=data;serviceDiscoveryMode=zooKeeper;user=data;zooKeeperNamespace=hiveserver2
19/03/01 10:40:34 [main]: INFO jdbc.HiveConnection: Connected to data2:1
Connected to: Apache Hive (version 3.1.0.3.1.0.0-78)
Driver: Hive JDBC (version 3.1.0.3.1.0.0-78)
Transaction isolation: TRANSACTION_REPEATABLE_READ
INFO : Compiling 
command(queryId=hive_20190301104027_818ae55f-3e3f-4754-8706-0279b693d9a8): 
describe extended db1.tbl1
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:col_name, 
type:string, comment:from deserializer), FieldSchema(name:data_type, 
type:string, comment:from deserializer), FieldSchema(name:comment, type:string, 
comment:from deserializer)], properties:null)
INFO : Completed compiling 
command(queryId=hive_20190301104027_818ae55f-3e3f-4754-8706-0279b693d9a8); Time 
taken: 0.044 seconds
INFO : Executing 
command(queryId=hive_20190301104027_818ae55f-3e3f-4754-8706-0279b693d9a8): 
describe extended db1.tbl1
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing 
command(queryId=hive_20190301104027_818ae55f-3e3f-4754-8706-0279b693d9a8); Time 
taken: 0.024 seconds
INFO : OK
+-++--+
| col_name | data_type | comment |
+-++--+
| code | string | |
| | NULL | NULL |
| Detailed Table Information | Table(tableName:tbl1, dbName:db1, 
owner:anonymous, createTime:1551431182, lastAccessTime:0, retention:0, 
sd:StorageDescriptor(cols:[FieldSchema(name:code, type:string, comment:null)], 
location:hdfs://data1:8020/warehouse/tablespace/managed/hive/db1.db/tbl1, 
inputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat, 
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
serializationLib:org.apache.hadoop.hive.ql.io.orc.OrcSerde, 
parameters:{serialization.format=1}), bucketCols:[], sortCols:[], 
parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
skewedColValueLocationMaps:{}), storedAsSubDirectories:false), 
partitionKeys:[], parameters:{totalSize=217593448, rawDataSize=0, numRows=0, 
transactional_properties=default, numFiles=200, 
transient_lastDdlTime=1551431187, bucketing_version=2, transactional=true}, 
viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, 
rewriteEnabled:false, catName:hive, ownerType:USER, writeId:1) | |
+-++--+
41 rows selected (0,157 seconds)
Beeline version 3.1.0.3.1.0.0-78 by Apache Hive
Closing: 0: 
jdbc:hive2://data2:2181,data1:2181/default;password=data;serviceDiscoveryMode=zooKeeper;user=data;zooKeeperNamespace=hiveserver2{code}


[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-01 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781766#comment-16781766
 ] 

star commented on HIVE-21367:
-

Could you please upload the HDFS file corresponding to that table, along with 
the output of SHOW CREATE TABLE?
