On 1/2/18 1:01 PM, Christian Köstlin wrote:
On 02.01.18 15:09, Steven Schveighoffer wrote:
On 1/2/18 8:57 AM, Adam D. Ruppe wrote:
On Tuesday, 2 January 2018 at 11:22:06 UTC, Stefan Koch wrote:
You can make it much faster by using a sliced static array as buffer.

Only if you want data corruption! It keeps a copy of your pointer
internally: https://github.com/dlang/phobos/blob/master/std/zlib.d#L605

It also will always overallocate new buffers on each call
<https://github.com/dlang/phobos/blob/master/std/zlib.d#L602>

There is no efficient way to use it. The implementation is substandard
because the API limits the design.

iopipe handles this quite well. And deals with the buffers properly
(yes, it is very tricky. You have to ref-count the zstream structure,
because it keeps internal pointers to *itself* as well!). And no, iopipe
doesn't use std.zlib, I use the etc.zlib functions (but I poached some
ideas from std.zlib when writing it).

https://github.com/schveiguy/iopipe/blob/master/source/iopipe/zip.d
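
For readers following along, here is a minimal sketch of calling the etc.c.zlib bindings directly, with one output chunk that is reused for every inflate call. It is not iopipe's implementation; the function name gunzipAll and the 64 KiB chunk size are made up for illustration. The z_stream is kept as a local that is never copied or moved after inflateInit2, which is one simple way to respect the internal self-pointer mentioned above.

```d
import etc.c.zlib;

/// Hypothetical helper: inflates a complete gzip stream held in memory.
ubyte[] gunzipAll(const(ubyte)[] input)
{
    z_stream zs;
    // 15 window bits, plus 16 to tell zlib to expect a gzip header.
    if (inflateInit2(&zs, 15 + 16) != Z_OK)
        throw new Exception("inflateInit2 failed");
    // zs stays in place until inflateEnd; zlib keeps a pointer back to it.
    scope (exit) inflateEnd(&zs);

    zs.next_in = cast(ubyte*) input.ptr;
    zs.avail_in = cast(uint) input.length;

    ubyte[64 * 1024] chunk = void; // reused output buffer (size is arbitrary)
    ubyte[] result;
    int err;
    do
    {
        zs.next_out = chunk.ptr;
        zs.avail_out = cast(uint) chunk.length;
        err = inflate(&zs, Z_NO_FLUSH);
        // Z_BUF_ERROR (e.g. truncated input) and data errors end up here.
        if (err != Z_OK && err != Z_STREAM_END)
            throw new Exception("inflate failed");
        // Copy out only the bytes produced by this call.
        result ~= chunk[0 .. chunk.length - zs.avail_out];
    }
    while (err != Z_STREAM_END);
    return result;
}
```

The append into result still copies every byte once, so this only illustrates the buffer handling and makes no claim about speed.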

I even wrote a json parser for iopipe. But it's far from complete. And
probably needs updating since I changed some of the iopipe API.

https://github.com/schveiguy/jsoniopipe

Depending on the use case, it might be enough, and should be very fast.

Thanks Steve for this proposal (actually I already had an iopipe version
on my hard disk that I applied to this problem). It's more or less your
unzip example + putting the data into an appender (I hope this is how it
should be done, to get the data into RAM).

Well, you don't need to use appender for that (and doing so is copying a lot of the data an extra time). All you need is to extend the pipe until there isn't any more new data, and it will all be in the buffer.

// Almost the same line as in your current version:
auto mypipe = openDev("../out/nist/2011.json.gz")
                  .bufd.unzip(CompressionFormat.gzip);

// This line will work with the current release (0.0.2):
while(mypipe.extend(0) != 0) {}

// But I have a fix for a bug that hasn't been released yet; this would
// work instead if you use iopipe-master:
mypipe.ensureElems();

// Getting the data is as simple as looking at the buffer.
auto data = mypipe.window; // ubyte[] of the data

iopipe is already better than the normal dlang version, almost like
Java, but still far from the solution. I updated
https://github.com/gizmomogwai/benchmarks/tree/master/gunzip

I will give the direct gunzip calls a try ...

In terms of JSON parsing, I had really nice results with the fast.json
pull parser, but it's comparing apples with oranges a little bit, because
I did not pull out all the data there.

Yeah, with jsoniopipe being very raw, I wouldn't be sure it was usable in your case. The end goal is to have something fast, but very easy to construct. I wasn't planning on focusing on speed (yet) like other libraries do, but on ease of writing code that uses it.

-Steve