[ 
https://issues.apache.org/jira/browse/PARQUET-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15338078#comment-15338078
 ] 

Deepak Majeti commented on PARQUET-474:
---------------------------------------

I don't think we should make the parquet-cpp library handle multi-threading at all.
Managing synchronization would be an unnecessary complication, and it is usually
handled well by the downstream application.
For parallelism, downstream applications can create multiple instances of the
reader, one per thread.
For efficiency, we could implement an API to share read-only resources, such as
the file footer, among the instances.

This is what the ORC library does too, and it works very well.
https://github.com/apache/orc
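
To make the one-reader-per-thread idea concrete, here is a minimal sketch. It
is not the parquet-cpp or ORC API: FileFooter, Reader, ReadRowGroup, and
CountRowsInParallel are hypothetical names made up for illustration, and the
"read" just returns a row count stored in the footer. The point is only that
each thread owns its reader and the footer is shared as immutable data, so no
locking is needed inside the library.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for parsed footer metadata. It is treated as
// immutable once built, so sharing it across threads is safe.
struct FileFooter {
  std::vector<int64_t> row_group_num_rows;  // rows in each row group
};

// Hypothetical per-thread reader. All mutable state is private to the
// instance; only the const footer is shared, so no locks are required.
class Reader {
 public:
  explicit Reader(std::shared_ptr<const FileFooter> footer)
      : footer_(std::move(footer)) {}

  // Stand-in for real decoding work: report the row count of row group i.
  int64_t ReadRowGroup(std::size_t i) {
    ++row_groups_read_;  // touched by the owning thread only
    return footer_->row_group_num_rows[i];
  }

 private:
  std::shared_ptr<const FileFooter> footer_;
  int64_t row_groups_read_ = 0;
};

// Application-side parallelism: one Reader per thread, with row groups
// assigned round-robin. Each thread writes only its own slot in `partial`.
int64_t CountRowsInParallel(std::shared_ptr<const FileFooter> footer,
                            std::size_t num_threads) {
  std::vector<int64_t> partial(num_threads, 0);
  std::vector<std::thread> workers;
  for (std::size_t t = 0; t < num_threads; ++t) {
    workers.emplace_back([&partial, footer, t, num_threads] {
      Reader reader(footer);  // created and used by this thread alone
      const std::size_t n = footer->row_group_num_rows.size();
      for (std::size_t i = t; i < n; i += num_threads) {
        partial[t] += reader.ReadRowGroup(i);
      }
    });
  }
  for (auto& w : workers) w.join();

  int64_t total = 0;
  for (int64_t rows : partial) total += rows;
  return total;
}
```

In this sketch the synchronization that remains (joining the threads and
combining the partial results) lives entirely in the application, which is
where I think it belongs.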

> InputStream and RandomAccessSource classes are not threadsafe
> -------------------------------------------------------------
>
>                 Key: PARQUET-474
>                 URL: https://issues.apache.org/jira/browse/PARQUET-474
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-cpp
>            Reporter: Wes McKinney
>            Assignee: Wes McKinney
>
> We need to ensure that files can be processed in multithreaded applications



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
