[
https://issues.apache.org/jira/browse/PARQUET-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15339582#comment-15339582
]
Deepak Majeti commented on PARQUET-474:
---------------------------------------
ORC C++ does IO in a synchronous fashion. Its framework works in a batch
manner.
If I understand your requirement correctly, you want parquet-cpp to have a
{{synchronized}} IO API (e.g. {{ReadAt()}} must be threadsafe), so that multiple
client threads can share the same IO resource with internal state. In other
words, you are looking at code built on a pipeline/streaming-style framework.
This JIRA makes sense for those cases.
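To make the shared-resource case concrete, here is a minimal sketch of a threadsafe {{ReadAt()}}. The class {{SharedSource}} and its members are hypothetical and only illustrate the idea; this is not the parquet-cpp implementation. The point is that the seek and the read happen under one lock, so two threads cannot interleave on the internal position.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical in-memory source, for illustration only.
class SharedSource {
 public:
  explicit SharedSource(std::vector<uint8_t> data) : data_(std::move(data)) {}

  // Seek and read as one atomic step. Returns the number of bytes copied.
  int64_t ReadAt(int64_t offset, int64_t nbytes, uint8_t* out) {
    assert(out != nullptr);
    std::lock_guard<std::mutex> guard(mutex_);
    const int64_t size = static_cast<int64_t>(data_.size());
    if (offset < 0 || nbytes < 0) return 0;
    pos_ = std::min(offset, size);                    // "seek"
    const int64_t n = std::min(nbytes, size - pos_);  // clamp at end of data
    if (n > 0) {
      std::memcpy(out, data_.data() + pos_, static_cast<std::size_t>(n));
    }
    pos_ += n;  // internal state that the lock protects
    return n;
  }

 private:
  std::vector<uint8_t> data_;
  int64_t pos_ = 0;
  std::mutex mutex_;
};
```

Several client threads can then call {{ReadAt()}} on one {{SharedSource}} instance, at the cost of taking the lock on every call.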
Some applications work in a batch manner by creating multiple instances of the
IO resource.
My concern is that introducing a {{synchronized}} IO API can add overhead for
such applications, since they would pay for locking they do not need.
For this JIRA, can you use {{#ifdef}} guards to enable/disable the
synchronization?
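One possible shape for such a compile-time switch is sketched below. The macro {{EXAMPLE_THREADSAFE_IO}} and the types {{IoMutex}} and {{PositionTracker}} are made-up names for illustration, not existing parquet-cpp symbols. When the macro is not defined, the mutex type is an empty struct whose {{lock()}}/{{unlock()}} do nothing, so batch-style applications that create one IO instance per thread should see little or no locking cost.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical build flag: define EXAMPLE_THREADSAFE_IO to get real locking.
#ifdef EXAMPLE_THREADSAFE_IO
using IoMutex = std::mutex;
#else
// No-op stand-in; satisfies the lock()/unlock() interface lock_guard needs.
struct IoMutex {
  void lock() {}
  void unlock() {}
};
#endif

// Hypothetical piece of IO state that is guarded only when the flag is set.
class PositionTracker {
 public:
  // Advances the position by nbytes and returns the new position.
  int64_t Advance(int64_t nbytes) {
    assert(nbytes >= 0);
    std::lock_guard<IoMutex> guard(mutex_);
    pos_ += nbytes;
    return pos_;
  }

 private:
  IoMutex mutex_;
  int64_t pos_ = 0;
};
```

The calling code is identical in both builds; only the definition of {{IoMutex}} changes.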
> InputStream and RandomAccessSource classes are not threadsafe
> --------------------------------------------------------------
>
> Key: PARQUET-474
> URL: https://issues.apache.org/jira/browse/PARQUET-474
> Project: Parquet
> Issue Type: Bug
> Components: parquet-cpp
> Reporter: Wes McKinney
> Assignee: Wes McKinney
>
> We need to ensure that files can be processed in multithreaded applications