Yeah, I figure this is a browser JavaScript limitation: anything with
access to core zip libraries on a machine should be able to implement this
fairly cheaply.
I'm surprised that browsers don't yet provide C++ zip/unzip APIs via
JavaScript. In my recent investigations, jszip, pako, etc. all fall over
when unzipping more than 500 MB (and are slow).

On Fri, 18 Dec 2020 at 04:26, Jacob Quinn <[email protected]> wrote:

>> Today, I think only C++ (and libraries that bind to it) have compression
>> implemented.  I think a new PR for Java was just opened in the last few
>> days.
>>
>
> Note the Julia implementation (Arrow.jl) supports compressing when writing
> and decompressing when reading. (Not that it really helps for the
> javascript side of things here, but just wanted to point it out as the
> Julia code is relatively new to the arrow project).
>
> On Thu, Dec 17, 2020 at 2:10 PM Andrew Clancy <[email protected]> wrote:
>
>> Yep - that's where I was expecting it!
>> These guys appear to implement decompression using pako:
>> https://github.com/usnistgov/jsfive - might be a good route to look
>> into.
>>
>>
>>
>> On Thu, 17 Dec 2020 at 19:19, Micah Kornfield <[email protected]>
>> wrote:
>>
>>> I don't know the state of support for the compression codecs in
>>> JavaScript, but I don't think anyone has attempted to implement them.
>>>
>>> I couldn't find the compression feature listed on the library status
>>> docs [1].
>>>
>>> But we should add a line item for it.  Today, I think only C++ (and
>>> libraries that bind to it) have compression implemented.  I think a new PR
>>> for Java was just opened in the last few days.
>>>
>>> [1] https://arrow.apache.org/docs/status.html
>>>
>>> On Thu, Dec 17, 2020 at 10:10 AM Andrew Clancy <[email protected]> wrote:
>>>
>>>> So, I figured out the issue here - I had to turn off compression by
>>>> passing compression='uncompressed' to pyarrow's feather.write_feather().
>>>> Is there any way to read a compressed feather file in arrow js?
>>>> See the comment under the first answer here:
>>>> https://stackoverflow.com/questions/64629670/how-to-write-a-pandas-dataframe-to-arrow-file/64648955#64648955
>>>> I couldn't find anything in the arrow docs or notebooks on this - I'm
>>>> assuming that's related to javascript compression libraries being so
>>>> limited.
>>>>
>>>>
>>>> On Mon, 14 Dec 2020 at 21:32, Andrew Clancy <[email protected]> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I have a simple feather file created via a pandas to_feather with a
>>>>> datetime64[ns] column, and cannot get timestamps in javascript
>>>>> [email protected]
>>>>>
>>>>> See this notebook:
>>>>> https://observablehq.com/@nite/apache-arrow-timestamp-investigation
>>>>>
>>>>> I'm guessing I'm missing something. Has anyone got any suggestions, or
>>>>> decent examples of reading a file created in pandas? I've seen examples
>>>>> of [email protected] where dates are stored as an array of 2 ints.
>>>>>
>>>>> File was created with:
>>>>>
>>>>> import pandas as pd
>>>>> df = pd.read_parquet('sample.parquet')  # assign the result so df exists
>>>>> df.to_feather('sample-seconds.feather')
>>>>>
>>>>> Final Q: I'm assuming this is the best place for this question?
>>>>> Happy to post elsewhere if there are any other forums, or if this should
>>>>> be a JIRA ticket?
>>>>>
>>>>> Thanks!
>>>>> Andy
>>>>>
>>>>
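
A note on the "array of 2 ints" mentioned in the original question: Arrow
stores a timestamp[ns] value as a signed 64-bit count of nanoseconds since
the Unix epoch. The sketch below assumes the two ints are the low and high
32-bit halves of that value, in that order (an assumption about what the
JavaScript side hands back, not something confirmed in this thread). The
helper names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def combine_int32_pair(low: int, high: int) -> int:
    """Rebuild a signed 64-bit integer from its low and high 32-bit halves."""
    # Assumption: `low` comes first and `high` second (little-endian order).
    # The low half carries no sign, so mask it to an unsigned 32-bit value;
    # the high half keeps its sign, which becomes the sign of the result.
    return (high << 32) | (low & 0xFFFFFFFF)


def ns_pair_to_datetime(low: int, high: int) -> datetime:
    """Convert a (low, high) pair of nanoseconds since the epoch to UTC."""
    ns = combine_int32_pair(low, high)
    # datetime has microsecond resolution, so sub-microsecond digits are dropped.
    return EPOCH + timedelta(microseconds=ns // 1000)


# Round trip: split a known nanosecond timestamp into halves, then rebuild it.
ns = 1_607_981_520_000_000_000  # 2020-12-14T21:32:00Z
low, high = ns & 0xFFFFFFFF, ns >> 32
print(ns_pair_to_datetime(low, high).isoformat())
```

If the values come back in the opposite order, or as milliseconds rather
than nanoseconds, swap the arguments or adjust the divisor accordingly.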
