Debdut Mukherjee created SPARK-28364:
----------------------------------------

             Summary: Unable to read complete data of an external hive table 
stored as ORC & pointing to a managed table's data files that is getting stored 
in sub-directories.
                 Key: SPARK-28364
                 URL: https://issues.apache.org/jira/browse/SPARK-28364
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.4.0
         Environment: !image-2019-07-12-13-42-29-304.png!
            Reporter: Debdut Mukherjee


Spark is unable to read the complete data of an external Hive table stored as 
ORC when its location points to a managed table's data files that are stored in 
sub-directories.

The row count also does not match unless the path is given with a trailing *.

*Example: this works:*

"adl://<adls_name>.azuredatalakestore.net/clusters/<cluster path>/hive/warehouse/db2.db/tbl1/***"

However, the above creates an empty directory named *** in ADLS (Azure Data Lake 
Store).

The definition below does not work: running SELECT COUNT(*) on this external 
table returns only a partial count.

CREATE EXTERNAL TABLE IF NOT EXISTS db1.tbl1 (
  Col_1 string,
  Col_2 string
)
STORED AS ORC
LOCATION "adl://<adls_name>.azuredatalakestore.net/clusters/<cluster path>/hive/warehouse/db2.db/tbl1/"
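
The difference between the plain path and the wildcard path is consistent with 
a reader that lists only the files sitting directly in the table location, 
while a trailing * also matches the sub-directories. The following is a minimal 
sketch of that idea on a local temporary directory. It is an illustration under 
that assumption, not Spark's actual file-listing code, and it does not touch 
ADLS. The directory layout, the file names, and the helper functions 
(build_layout, files_directly_under, files_for_glob) are all hypothetical.

```python
import tempfile
from pathlib import Path


def build_layout(root: Path) -> None:
    """Create a hypothetical table directory: one file at the top level
    and one file in each of two sub-directories (names are made up)."""
    (root / "part-00000.orc").write_text("top-level file")
    for sub in ("sub_1", "sub_2"):
        (root / sub).mkdir()
        (root / sub / "part-00000.orc").write_text("nested file")


def files_directly_under(path: Path) -> list:
    """Non-recursive listing: only files that sit directly in `path`."""
    return sorted(p for p in path.iterdir() if p.is_file())


def files_for_glob(root: Path, pattern: str) -> list:
    """Expand `pattern` under `root`, then list each matched directory
    one level deep (matched files are taken as they are)."""
    found = []
    for match in root.glob(pattern):
        if match.is_file():
            found.append(match)
        elif match.is_dir():
            found.extend(files_directly_under(match))
    return sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    table_dir = Path(tmp)
    build_layout(table_dir)
    # Plain location: files inside sub_1/ and sub_2/ are skipped.
    plain_count = len(files_directly_under(table_dir))
    # Location with a trailing "*": the sub-directories are matched too.
    glob_count = len(files_for_glob(table_dir, "*"))
    print(plain_count, glob_count)  # prints "1 3"
```

In this sketch the plain location sees 1 of the 3 files, which would show up as 
a partial count, while the wildcard location sees all 3.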


--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

