[ 
https://issues.apache.org/jira/browse/PARQUET-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16056366#comment-16056366
 ] 

Ashima Sood commented on PARQUET-1036:
--------------------------------------

Also, when using spark-sql on Unix, the results returned are as below:

spark-sql> select * from <table> limit 5;
17/06/20 19:53:58 INFO SparkSqlParser: Parsing command: select * from <table> 
limit 5
17/06/20 19:53:58 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 
10.107.206.68:40150 in memory (size: 2.1 KB, free: 413.9 MB)
17/06/20 19:53:58 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 
ip-10-107-206-78.fmrco.com:33367 in memory (size: 2.1 KB, free: 2.8 GB)
17/06/20 19:53:58 INFO ContextCleaner: Cleaned accumulator 0
17/06/20 19:53:58 INFO CatalystSqlParser: Parsing command: string
[previous line repeated 109 more times]
17/06/20 19:54:00 INFO FileSourceStrategy: Pruning directories with:
17/06/20 19:54:00 INFO FileSourceStrategy: Post-Scan Filters:
17/06/20 19:54:00 INFO FileSourceStrategy: Output Data Schema: struct<date: 
string, row_id: string, status: string, gen_time: string, gen_month: string ... 
108 more fields>
17/06/20 19:54:00 INFO FileSourceStrategy: Pushed Filters:
17/06/20 19:54:00 WARN Utils: Truncated the string representation of a plan 
since it was too large. This behavior can be adjusted by setting 
'spark.debug.maxToStringFields' in SparkEnv.conf.
17/06/20 19:54:00 INFO MemoryStore: Block broadcast_1 stored as values in 
memory (estimated size 350.8 KB, free 413.6 MB)
17/06/20 19:54:00 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in 
memory (estimated size 31.9 KB, free 413.6 MB)
17/06/20 19:54:00 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 
10.107.206.68:40150 (size: 31.9 KB, free: 413.9 MB)
17/06/20 19:54:00 INFO SparkContext: Created broadcast 1 from processCmd at 
CliDriver.java:376
17/06/20 19:54:00 INFO FileSourceScanExec: Planning scan with bin packing, max 
size: 4194304 bytes, open cost is considered as scanning 4194304 bytes.
17/06/20 19:54:00 INFO SparkContext: Starting job: processCmd at 
CliDriver.java:376
17/06/20 19:54:00 INFO DAGScheduler: Got job 1 (processCmd at 
CliDriver.java:376) with 1 output partitions
17/06/20 19:54:00 INFO DAGScheduler: Final stage: ResultStage 1 (processCmd at 
CliDriver.java:376)
17/06/20 19:54:00 INFO DAGScheduler: Parents of final stage: List()
17/06/20 19:54:00 INFO DAGScheduler: Missing parents: List()
17/06/20 19:54:00 INFO DAGScheduler: Submitting ResultStage 1 
(MapPartitionsRDD[6] at processCmd at CliDriver.java:376), which has no missing 
parents
17/06/20 19:54:00 INFO MemoryStore: Block broadcast_2 stored as values in 
memory (estimated size 18.8 KB, free 413.5 MB)
17/06/20 19:54:00 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in 
memory (estimated size 7.5 KB, free 413.5 MB)
17/06/20 19:54:00 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 
10.107.206.68:40150 (size: 7.5 KB, free: 413.9 MB)
17/06/20 19:54:00 INFO SparkContext: Created broadcast 2 from broadcast at 
DAGScheduler.scala:996
17/06/20 19:54:00 INFO DAGScheduler: Submitting 1 missing tasks from 
ResultStage 1 (MapPartitionsRDD[6] at processCmd at CliDriver.java:376)
17/06/20 19:54:00 INFO YarnScheduler: Adding task set 1.0 with 1 tasks
17/06/20 19:54:00 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, 
ip-10-107-206-78.fmrco.com, executor 1, partition 0, RACK_LOCAL, 6573 bytes)
17/06/20 19:54:00 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 
ip-10-107-206-78.fmrco.com:33367 (size: 7.5 KB, free: 2.8 GB)
17/06/20 19:54:01 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 
ip-10-107-206-78.fmrco.com:33367 (size: 31.9 KB, free: 2.8 GB)
17/06/20 19:54:05 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) 
in 4476 ms on ip-10-107-206-78.fmrco.com (executor 1) (1/1)
17/06/20 19:54:05 INFO YarnScheduler: Removed TaskSet 1.0, whose tasks have all 
completed, from pool
17/06/20 19:54:05 INFO DAGScheduler: ResultStage 1 (processCmd at 
CliDriver.java:376) finished in 4.477 s
17/06/20 19:54:05 INFO DAGScheduler: Job 1 finished: processCmd at 
CliDriver.java:376, took 4.518839 s
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULLNULL     NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULLNULL     NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULLNULL     NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULLNULL     NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULLNULL     NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULLNULL     NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULLNULL     NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULLNULL     NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULLNULL     NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULLNULL     NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULLNULL     NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULLNULL     NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULLNULL     NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULLNULL     NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULLNULL     NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULLNULL     NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULLNULL     NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULLNULL     NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULLNULL     NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULLNULL     NULL    NULL    NULL    
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL
Time taken: 6.702 seconds, Fetched 5 row(s)
17/06/20 19:54:05 INFO CliDriver: Time taken: 6.702 seconds, Fetched 5 row(s)


> parquet file created via pyarrow 0.4.0 ; version 1.0 - incompatible with Spark
> ------------------------------------------------------------------------------
>
>                 Key: PARQUET-1036
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1036
>             Project: Parquet
>          Issue Type: Bug
>            Reporter: Ashima Sood
>            Priority: Blocker
>
> Using Spark SQL, I am unable to read the Parquet file: every value is 
> shown as NULL, whereas Hive reads the values fine.
> 17/06/19 17:50:36 WARN CorruptStatistics: Ignoring statistics because created_by could not be parsed (see PARQUET-251): parquet-cpp version 1.0.0
> org.apache.parquet.VersionParser$VersionParseException: Could not parse created_by: parquet-cpp version 1.0.0 using format: (.+) version ((.*) )?\(build ?(.*)\)
>                 at org.apache.parquet.VersionParser.parse(VersionParser.java:112)
>                 at org.apache.parquet.CorruptStatistics.shouldIgnoreStatistics(CorruptStatistics.java:60)
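
The VersionParseException quoted above can be reproduced outside Spark. The following is a minimal sketch in Python, assuming the parser requires the whole created_by string to match the format printed in the exception message. The helper name parse_created_by and the parquet-mr style sample string are made up for illustration and do not come from this report.

```python
import re

# Format string copied from the exception message above.
CREATED_BY_FORMAT = r"(.+) version ((.*) )?\(build ?(.*)\)"


def parse_created_by(created_by):
    """Return (application, version, build), or None if the string does not match."""
    # Assumption: the entire string has to match, hence fullmatch.
    match = re.fullmatch(CREATED_BY_FORMAT, created_by)
    if match is None:
        return None
    return match.group(1), match.group(3), match.group(4)


# The created_by value from the warning has no "(build ...)" suffix.
print(parse_created_by("parquet-cpp version 1.0.0"))

# Hypothetical created_by value that does carry a build suffix.
print(parse_created_by("parquet-mr version 1.8.1 (build abc123)"))
```

Under that assumption the first call yields no match, because the format insists on a literal "(build ...)" part that "parquet-cpp version 1.0.0" does not contain. The warning itself only states that statistics are ignored as a result.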



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
