[jira] [Commented] (SPARK-24176) The hdfs file path with wildcard can not be identified when loading data

kevin yu (JIRA) Mon, 07 May 2018 11:09:15 -0700

    [ https://issues.apache.org/jira/browse/SPARK-24176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16466263#comment-16466263 ]


kevin yu commented on SPARK-24176:
----------------------------------

I am looking at this one and will propose a fix soon.

> The hdfs file path with wildcard can not be identified when loading data
> ------------------------------------------------------------------------
>
>                 Key: SPARK-24176
>                 URL: https://issues.apache.org/jira/browse/SPARK-24176
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>         Environment: OS: SUSE11
> Spark Version:2.3
>            Reporter: ABHISHEK KUMAR GUPTA
>            Priority: Minor
>
>  # Launch spark-sql.
>  # Create the table:
> create table wild1 (time timestamp, name string, isright boolean, 
> datetoday date, num binary, height double, score float, decimaler 
> decimal(10,0), id tinyint, age int, license bigint, length smallint) row 
> format delimited fields terminated by ',' stored as textfile;
>  # Load data into the table using wildcard paths. The results are not 
> consistent: some patterns succeed and others fail.
> load data inpath '/user/testdemo1/user1/?ype* ' into table wild1; - Success
> load data inpath '/user/testdemo1/user1/t??eddata60.txt' into table wild1; - *Failed*
> load data inpath '/user/testdemo1/user1/?ypeddata60.txt' into table wild1; - Success
> The failing case produces the exception below:
> > load data inpath '/user/testdemo1/user1/t??eddata61.txt' into table wild1;
> 2018-05-04 13:16:25 INFO HiveMetaStore:746 - 0: get_database: one
> 2018-05-04 13:16:25 INFO audit:371 - ugi=spark/had...@hadoop.com 
> ip=unknown-ip-addr cmd=get_database: one
> 2018-05-04 13:16:25 INFO HiveMetaStore:746 - 0: get_table : db=one tbl=wild1
> 2018-05-04 13:16:25 INFO audit:371 - ugi=spark/had...@hadoop.com 
> ip=unknown-ip-addr cmd=get_table : db=one tbl=wild1
> 2018-05-04 13:16:25 INFO HiveMetaStore:746 - 0: get_table : db=one tbl=wild1
> 2018-05-04 13:16:25 INFO audit:371 - ugi=spark/had...@hadoop.com 
> ip=unknown-ip-addr cmd=get_table : db=one tbl=wild1
> *Error in query: LOAD DATA input path does not exist: 
> /user/testdemo1/user1/t??eddata61.txt;*
> spark-sql>
> The behavior is inconsistent. The fix needs to cover all combinations of 
> wildcard characters.



