Hi Parquet team,
It is very exciting to see this effort. Thanks Micah for starting this.
For most use cases that our team sees, the broad areas for improvement
appear to be:
1) Optimizing for cloud storage (latency is high, seeks are expensive)
2) Optimized metadata reading - we've seen
Hi folks, how can I find information about how to join?
Thank you Andrew!
On Mon, May 20, 2024 at 7:05 AM Andrew Lamb wrote:
> Here is the infrastructure ticket with the request to rename the
> repository: https://issues.apache.org/jira/browse/INFRA-25802
>
> On Fri, May 17, 2024 at 1:28 PM Prem Sahoo wrote:
>
> > +1, as it will be an apt name.
> >
Here is the infrastructure ticket with the request to rename the
repository: https://issues.apache.org/jira/browse/INFRA-25802
On Fri, May 17, 2024 at 1:28 PM Prem Sahoo wrote:
> +1, as it will be an apt name.
>
> > On May 17, 2024, at 12:32 PM, Daniel Weeks wrote:
> >
> >
I have filed an issue[1] with this request
[1] https://issues.apache.org/jira/browse/INFRA-25801
On Wed, May 15, 2024 at 6:54 PM Julien Le Dem wrote:
> +1
>
> On Wed, May 15, 2024 at 4:15 AM Andrew Lamb
> wrote:
>
> > I plan to wait until next week to allow anyone else who has an opinion
>
Hello all,
I work in environments where both usages exist. The single-file approach, at
least in this setting, comes from the fact that a lot of input data for ML
pipelines has historically been a single CSV file dump. Since a lot of data
analysis tools have also been single-threaded, people are