[ 
https://issues.apache.org/jira/browse/PARQUET-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15339582#comment-15339582
 ] 

Deepak Majeti commented on PARQUET-474:
---------------------------------------

ORC C++ does IO in a synchronous fashion. Its framework works in a batch 
manner.
If I understand your requirement correctly, you want parquet-cpp to have a 
{{synchronized}} IO API (e.g., {{ReadAt()}} must be thread-safe), so that multiple 
client threads can share the same IO resource with internal state. In other 
words, you are looking at code built on a pipeline/streaming-style framework. 
This JIRA makes sense for those cases.
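
To make sure we mean the same thing, here is a minimal sketch of the shared-resource case. {{SharedSource}} and {{ReadConcurrently()}} are hypothetical names for illustration, not the actual parquet-cpp classes, and the data is an in-memory string rather than a file. The point is only that the seek and the read both touch internal state, so they happen under one lock.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical in-memory source with internal position state.
class SharedSource {
 public:
  explicit SharedSource(std::string data) : data_(std::move(data)) {}

  // The seek and the copy both use pos_, so one lock covers both steps
  // and concurrent callers cannot interleave between them.
  size_t ReadAt(size_t offset, size_t nbytes, char* out) {
    assert(out != nullptr);
    std::lock_guard<std::mutex> guard(mutex_);
    pos_ = std::min(offset, data_.size());
    const size_t n = std::min(nbytes, data_.size() - pos_);
    std::memcpy(out, data_.data() + pos_, n);
    pos_ += n;
    return n;
  }

 private:
  std::string data_;
  size_t pos_ = 0;
  std::mutex mutex_;
};

// Four client threads read disjoint 5-byte ranges from one shared source.
// Returns true if every thread got exactly the bytes it asked for.
bool ReadConcurrently() {
  const std::string payload = "0123456789abcdefghij";
  SharedSource source(payload);
  std::vector<std::string> results(4, std::string(5, '\0'));
  std::vector<std::thread> workers;
  for (size_t i = 0; i < results.size(); ++i) {
    workers.emplace_back([&source, &results, i] {
      source.ReadAt(i * 5, 5, &results[i][0]);
    });
  }
  for (auto& w : workers) w.join();
  for (size_t i = 0; i < results.size(); ++i) {
    if (results[i] != payload.substr(i * 5, 5)) return false;
  }
  return true;
}
```

Every call pays for the lock here, including calls from applications that never share the source across threads.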

Some applications work in a batch manner by creating multiple instances of the 
IO resource.
My concern is that introducing a {{synchronized}} IO API can add overhead for 
such applications.
For this JIRA, can you use {{#ifdef}} guards to enable/disable the synchronization?
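
As a rough sketch of what I have in mind, a compile-time switch could pick either a real mutex or a no-op stand-in. The macro name {{PARQUET_SYNCHRONIZED_IO}} and the types {{NullMutex}}, {{IoMutex}} and {{BufferSource}} are made up for this example and do not exist in parquet-cpp.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <mutex>
#include <string>
#include <utility>

// PARQUET_SYNCHRONIZED_IO is a hypothetical build flag for this sketch.
#ifdef PARQUET_SYNCHRONIZED_IO
using IoMutex = std::mutex;
#else
// No-op type with the same lock()/unlock() interface, so the calling
// code is identical in both builds.
struct NullMutex {
  void lock() {}
  void unlock() {}
};
using IoMutex = NullMutex;
#endif

// Hypothetical in-memory source; stands in for a real IO resource.
class BufferSource {
 public:
  explicit BufferSource(std::string data) : data_(std::move(data)) {}

  // Locks a real mutex only when the flag is defined; otherwise the
  // guard calls the empty lock()/unlock() of NullMutex.
  size_t ReadAt(size_t offset, size_t nbytes, char* out) {
    std::lock_guard<IoMutex> guard(mutex_);
    const size_t start = std::min(offset, data_.size());
    const size_t n = std::min(nbytes, data_.size() - start);
    std::memcpy(out, data_.data() + start, n);
    return n;
  }

 private:
  std::string data_;
  IoMutex mutex_;
};
```

Batch applications that create one instance per thread would build without the flag, and pipeline-style applications that share an instance would build with it.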

> InputStream and RandomAccessSource classes are not threadsafe
> -------------------------------------------------------------
>
>                 Key: PARQUET-474
>                 URL: https://issues.apache.org/jira/browse/PARQUET-474
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-cpp
>            Reporter: Wes McKinney
>            Assignee: Wes McKinney
>
> We need to ensure that files can be processed in multithreaded applications



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
