Re: Httparsed - fast native dlang HTTP 1.x message header parser

2022-05-28 Thread tchaloupka via Digitalmars-d-announce

On Saturday, 28 May 2022 at 05:37:06 UTC, test123 wrote:

> Maybe we can add the picohttpparser test case into httparsed.


Hi, it is actually 
[there](https://github.com/tchaloupka/httparsed/blob/e07906e61b7c0b5123ecec4ea6a578b1768c47da/source/httparsed.d#L669); probably not everything, but most of it.


> I tried to use this project, but it gives wrong results when 
> parsing some headers.
>
> I had to add this to nginx to avoid the error:
>
> ```sh
>   proxy_set_header Sec-Fetch-User "";
>   proxy_set_header Sec-Ch-Ua-Mobile "";
>   proxy_set_header Sec-Ch-Ua "";
> ```


Please file the issue directly in the 
[repository](https://github.com/tchaloupka/httparsed), ideally 
with a real problematic header, so I can add it to the tests and 
see what's wrong.


This doesn't really belong in the announce forum ;-)

Thx


Httparsed - fast native dlang HTTP 1.x message header parser

2022-05-27 Thread test123 via Digitalmars-d-announce

https://forum.dlang.org/post/odlataafslwqvsgsm...@forum.dlang.org

On Monday, 14 December 2020 at 21:59:02 UTC, tchaloupka wrote:

> Hi,
> I was missing some commonly usable HTTP parser on 
> code.dlang.org and after some research and work I've published 
> httparsed[1].
>
> [1] https://code.dlang.org/packages/httparsed
> [2] https://github.com/h2o/picohttpparser
> [3] https://i.imgur.com/iRCDGVo.png
> [4] https://github.com/nodejs/http-parser
> [5] https://github.com/adamdruppe/arsd/blob/402ea062b81197410b05df7f75c299e5e3eef0d8/cgi.d#L1737
> [6] https://github.com/tchaloupka/httparsed/blob/230ba9a4a280ba91267a22e97137be12269b5574/bench/bench.d#L194


Thanks for the great work.

Maybe we can add the picohttpparser test case into httparsed.

I tried to use this project, but it gives wrong results when parsing some 
headers.


I had to add this to nginx to avoid the error:


```sh
  proxy_set_header Sec-Fetch-User "";
  proxy_set_header Sec-Ch-Ua-Mobile "";
  proxy_set_header Sec-Ch-Ua "";
```






Re: Httparsed - fast native dlang HTTP 1.x message header parser

2020-12-15 Thread Jacob Carlborg via Digitalmars-d-announce

On 2020-12-14 22:59, tchaloupka wrote:

> Hi,
> I was missing some commonly usable HTTP parser on code.dlang.org and 
> after some research and work I've published httparsed[1].


This is awesome. I wanted to use picohttpparser myself and used the C 
version. But if you have already created an HTTP parser with the same 
properties in D, that's even better.



--
/Jacob Carlborg


Re: Httparsed - fast native dlang HTTP 1.x message header parser

2020-12-15 Thread Adam D. Ruppe via Digitalmars-d-announce

On Tuesday, 15 December 2020 at 10:04:42 UTC, tchaloupka wrote:
> But if these benchmarks help Adam to make some incremental 
> improvements, that's a plus, and many of those can be pretty 
> low-hanging fruit.


Yeah, I think the biggest benefit to changing this around is to 
just avoid creating unnecessary garbage.


On the individual item, it doesn't really matter, but it can 
build up to a totally wasted collection cycle as time goes on. 
On the other hand, in any non-trivial real-world application 
there's likely to be some garbage generated anyway, and this will 
disappear into the noise.


Though in the hello world benches it could bring the "max" column 
down, since I'm pretty sure that is caused by a GC cycle and hello 
world can potentially avoid having even one :P


> That means that with a performant parser, arsd could go up to 
> around 27548 RPS -> not much of a difference that would be 
> worth the hassle..


Yeah, that one is basically entirely the result of the thread 
work queue. If everything else was perfect, the thread stuff 
would still dominate. (My evidence for this is the hybrid and 
process dispatchers doing pretty consistently better. The thread 
one though is simple and cross-platform which is nice - like 
without it, that Mac version probably wouldn't have worked at all 
since I've written no mac-specific code in this module.)


Re: Httparsed - fast native dlang HTTP 1.x message header parser

2020-12-15 Thread tchaloupka via Digitalmars-d-announce

On Tuesday, 15 December 2020 at 00:32:42 UTC, H. S. Teoh wrote:

> For that alone, I think Adam deserves a salute.
>
> (But of course, if Adam improves cgi.d to be competitive with vibe.d,
> then it could totally rock the D world! ;-))


Yes, absolutely. arsd has a somewhat different use case and target 
audience; no one should expect it to beat the top 10 highly 
optimized frameworks in the TechEmpower benchmark ;-)


But if these benchmarks help Adam to make some incremental 
improvements, that's a plus, and many of those can be pretty 
low-hanging fruit.


If I take one arsd number from httpbench, 27469 RPS, 
it means 36.4us per request.
In the HTTP parser test, arsd takes about 2.4us per request, while 
httparsed takes about 0.1us per request.


That means that with a performant parser, arsd could go up to 
around 27548 RPS, which is not enough of a difference to be worth 
the hassle.
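
As a rough cross-check of this arithmetic, here is a small sketch. It assumes the simplest possible model, in which the parser time saved is just subtracted from the time per request; `projectedRps` is a hypothetical helper written for this illustration, and the inputs are only the figures quoted in this post.

```d
import std.stdio : writefln;

/// Hypothetical helper: requests per second if `savedUs` microseconds
/// are removed from every request of a server currently doing `rps`.
double projectedRps(double rps, double savedUs) pure nothrow @nogc @safe
{
    immutable perRequestUs = 1_000_000.0 / rps; // time per request in microseconds
    return 1_000_000.0 / (perRequestUs - savedUs);
}

void main()
{
    enum baseRps = 27_469.0; // arsd figure taken from httpbench

    writefln("per request:   %.1f us", 1_000_000.0 / baseRps);
    // Saving the full difference between the two parsers (2.4 us - 0.1 us).
    writefln("saving 2.3 us: %.0f RPS", projectedRps(baseRps, 2.3));
    // Saving only 0.1 us per request.
    writefln("saving 0.1 us: %.0f RPS", projectedRps(baseRps, 0.1));
}
```

Under this model, removing 0.1us per request gives roughly 27.5k RPS, in line with the figure above, while removing the full 2.3us difference between the two parsers would give roughly 29.3k RPS. Either way, header parsing is a small part of the 36.4us spent per request.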


Re: Httparsed - fast native dlang HTTP 1.x message header parser

2020-12-14 Thread Adam D. Ruppe via Digitalmars-d-announce

On Tuesday, 15 December 2020 at 00:32:42 UTC, H. S. Teoh wrote:

> It may not be the fastest web module in the D world


It actually does quite well, see: 
https://github.com/tchaloupka/httpbench (from the same OP here :) 
)


The header parser is nothing special, but since header parsing is 
a small part of the overall problem, it is good enough.


Though I have been tempted to optimize it a bit more since in a 
hello world benchmark even a small thing like header parsing can 
be noticeable. The fact that it does some totally unnecessary GC 
allocations can perhaps add up too.


(If I was doing all this again from scratch I'd actually be 
tempted to do a zero-copy, all lazy version. Read from the socket 
directly into the request-local buffer, then slice into it while 
parsing, then do decoding on-demand in that same buffer - url 
encoding always takes more space than the decoded version - and 
the result should be basically the fastest thing you can get. And 
if something comes in above typical size, then it can go back to 
the normal reallocated buffer and still win big on the average 
request. The problem with doing that now would be maintaining 
compatibility with my existing API.)
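
A minimal sketch of that idea, assuming a plain `char[]` as a stand-in for the request-local buffer. `decodeInPlace` and `hexValue` are hypothetical names made up for this illustration and are not part of cgi.d; only `%XX` escapes are handled.

```d
import std.ascii : isHexDigit;
import std.string : indexOf;

/// Value of a single hex digit; the caller checks isHexDigit first.
ubyte hexValue(char c) pure nothrow @nogc @safe
{
    if (c >= '0' && c <= '9') return cast(ubyte)(c - '0');
    if (c >= 'a' && c <= 'f') return cast(ubyte)(c - 'a' + 10);
    return cast(ubyte)(c - 'A' + 10);
}

/// Percent-decodes `s` in place and returns the decoded prefix of the same memory.
/// The write index never overtakes the read index, so no second buffer is needed.
char[] decodeInPlace(char[] s) pure nothrow @nogc @safe
{
    size_t w = 0;
    for (size_t r = 0; r < s.length; ++r)
    {
        if (s[r] == '%' && r + 2 < s.length
            && isHexDigit(s[r + 1]) && isHexDigit(s[r + 2]))
        {
            s[w++] = cast(char)(hexValue(s[r + 1]) * 16 + hexValue(s[r + 2]));
            r += 2; // skip the two hex digits just consumed
        }
        else
            s[w++] = s[r];
    }
    return s[0 .. w];
}

void main()
{
    // Stand-in for a request-local buffer that was filled straight from the socket.
    char[] buffer = "GET /search?q=hello%20world HTTP/1.1\r\n".dup;

    // Slice the request target out of the buffer: no copy, just pointer + length.
    immutable start = buffer.indexOf(' ') + 1;
    immutable end = start + buffer[start .. $].indexOf(' ');
    char[] target = buffer[start .. end];

    // Decode only when the value is asked for, reusing the same memory.
    immutable q = target.indexOf("q=") + 2;
    char[] value = decodeInPlace(target[q .. $]);
    assert(value == "hello world");
    assert(value.ptr == target.ptr + q); // still points into the original buffer
}
```

Decoding in place overwrites the raw bytes, so in this sketch a value can be decoded only once and the undecoded form is gone afterwards.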


> (But of course, if Adam improves cgi.d to be competitive with 
> vibe.d


My biggest deficit compared to vibe is prolly documentation. 
Especially of my advanced features which are practically hidden.


Re: Httparsed - fast native dlang HTTP 1.x message header parser

2020-12-14 Thread H. S. Teoh via Digitalmars-d-announce
On Tue, Dec 15, 2020 at 12:11:44AM +, Adam D. Ruppe via 
Digitalmars-d-announce wrote:
> On Monday, 14 December 2020 at 21:59:02 UTC, tchaloupka wrote:
> > * arsd's cgi.d - I didn't expect it to be so much slower than
> > the vibe-d parser; it's almost 3 times slower, but on the other hand
> > it's super simple idiomatic D (again, it doesn't check or allow what
> > the RFC says it should, and many tests will fail)
> 
> yeah, I think I actually wrote that about eight years ago and then
> never revisited it. Actually, git blame says "committed on Mar 24,
> 2012", so almost nine! And indeed, that git blame shows the bulk of it
> is still the initial commit, though a few `toLower`s got changed to
> `asLowerCase` a few years ago... so it used to be even worse! lol

Slow or not, cgi.d is totally awesome in my book, because recently it
saved my life.  While helping out someone, I threw together a little D
script to do what he wanted; only, I run Linux and he runs a Mac, and my
script is CLI-only while he's a non-poweruser and has no idea what to do
at the command prompt.  So naturally my thought was, let's give this a
web interface so that there's a fighting chance non-programmers would
know how to use it.  Being a program I wrote in literally 4 hours
(possibly less), I wasn't going to let it turn into a monster full of
hundreds of 3rd party dependencies, so I reached for my trusty solution:
arsd's cgi.d.

Just a single file, no network dependencies, no complicated builds, just
drop the file into my code, import it, and off I go.  Better yet, it
came with a built-in CLI request tester: perfect for local testing
without the hassle of needing to start/stop an entire web service just
to run a quick test; plus a compile-time switch to adapt it to any
common webserver interface you like: CGI, FastCGI, even standalone HTTP
server.  Problem solved in a couple o' hours, as opposed to who knows
how long it would have taken to engineer a "real" solution with vibe.d
or one of the other heavyweight "frameworks" out there.

It may not be the fastest web module in the D world, but it's certainly
danged convenient, does the necessary job with a minimum of fuss, easily
adaptable to a variety of common use cases, and best of all, requires
basically no dependencies beyond just dropping the file into your code.

For that alone, I think Adam deserves a salute.

(But of course, if Adam improves cgi.d to be competitive with vibe.d,
then it could totally rock the D world! ;-))


T

-- 
Written on the window of a clothing store: No shirt, no shoes, no service.


Re: Httparsed - fast native dlang HTTP 1.x message header parser

2020-12-14 Thread Adam D. Ruppe via Digitalmars-d-announce

On Monday, 14 December 2020 at 21:59:02 UTC, tchaloupka wrote:
> * arsd's cgi.d - I didn't expect it to be so much slower 
> than the vibe-d parser; it's almost 3 times slower, but on the 
> other hand it's super simple idiomatic D (again, it doesn't check 
> or allow what the RFC says it should, and many tests will fail)


yeah, I think I actually wrote that about eight years ago and 
then never revisited it. Actually, git blame says "committed on 
Mar 24, 2012", so almost nine! And indeed, that git blame shows 
the bulk of it is still the initial commit, though a few 
`toLower`s got changed to `asLowerCase` a few years ago... so it 
used to be even worse! lol


But wanna see something that will make you cry?

https://github.com/adamdruppe/arsd/blob/master/http2.d#L1232

I have another http header parser!!! That's for my client, and as 
you can see, it is... not great. The case-insensitivity for 
example is a mega hack and I actually need to fix that eventually.


At least there's some support for line continuations there. I 
don't remember if I ever actually tested that though, it seems 
most clients and servers don't do that anyway.
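
For illustration only, here is one allocation-free way to handle both points. This is a sketch, not what http2.d does: `sameHeaderName` and `isContinuation` are hypothetical helpers, and they assume header names are plain ASCII.

```d
import std.ascii : toLower;

/// True if two header names are equal ignoring ASCII case; allocates nothing.
bool sameHeaderName(const(char)[] a, const(char)[] b) pure nothrow @nogc @safe
{
    if (a.length != b.length) return false;
    foreach (i, c; a)
        if (toLower(c) != toLower(b[i])) return false;
    return true;
}

/// True if `line` continues the previous header value (obsolete line folding):
/// such lines start with a space or a horizontal tab.
bool isContinuation(const(char)[] line) pure nothrow @nogc @safe
{
    return line.length > 0 && (line[0] == ' ' || line[0] == '\t');
}

void main()
{
    assert(sameHeaderName("Content-Length", "content-length"));
    assert(!sameHeaderName("Host", "Hosts"));
    assert(isContinuation("\tcontinued value"));
    assert(!isContinuation("Host: example.com"));
}
```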


Httparsed - fast native dlang HTTP 1.x message header parser

2020-12-14 Thread tchaloupka via Digitalmars-d-announce

Hi,
I was missing some commonly usable HTTP parser on code.dlang.org 
and after some research and work I've published httparsed[1].


It's inspired by picohttpparser[2], which is great, but instead of 
a binding I wanted something native to D. Go has its own 
parsers, Rust has its own parsers, why not D?


I think code.dlang.org is missing small libraries like this that 
larger ones could build on, as is common in other languages, 
improving the ecosystem along the way. Vibe-d 
is just huuuge.


It is nothrow, @nogc and can work with betterC. It just parses 
the message header and calls provided callbacks with slices to 
the original buffer to be handled as needed by the caller.
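
To illustrate the callback-with-slices style in general terms, here is a toy sketch. It is not the httparsed API: `FirstHeader`, `onHeader` and `parseHeaderLine` are names assumed for this example, and the "parser" only splits a single `Name: value` line.

```d
/// Hypothetical handler: it only stores slices, so nothing is copied or allocated.
struct FirstHeader
{
    const(char)[] name, value;

    void onHeader(const(char)[] n, const(char)[] v) pure nothrow @nogc @safe
    {
        name = n;
        value = v;
    }
}

/// Toy parser for a single "Name: value" line (not the httparsed implementation).
/// Returns false if the line has no colon.
bool parseHeaderLine(Handler)(const(char)[] line, ref Handler handler) nothrow @nogc
{
    foreach (i, c; line)
    {
        if (c != ':') continue;
        auto value = line[i + 1 .. $];
        while (value.length && (value[0] == ' ' || value[0] == '\t'))
            value = value[1 .. $]; // skip optional leading whitespace
        handler.onHeader(line[0 .. i], value);
        return true;
    }
    return false;
}

void main()
{
    static immutable request = "Host: example.com";
    FirstHeader h;
    const ok = parseHeaderLine(request, h);
    assert(ok);
    assert(h.name == "Host" && h.value == "example.com");
    assert(h.value.ptr == request.ptr + 6); // the slice points into the original buffer
}
```

Because the handler receives slices, the caller decides whether a value is copied, compared in place, or ignored.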


Like picohttpparser, it uses the SSE4.2 `_mm_cmpestri` instruction 
to speed up the lookup of invalid characters (when built with ldc2 
for a target that supports it).
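
For readers unfamiliar with that kind of scan, here is a scalar sketch of what the instruction accelerates: `_mm_cmpestri` can test up to 16 bytes against a set of character ranges at once, whereas the loop below tests one byte at a time. `findInvalid` is a hypothetical name, and the character set (control characters other than tab, plus DEL) is an assumption for this example rather than the exact set httparsed uses.

```d
/// Scalar reference scan: index of the first byte that needs attention in a
/// header value, or buf.length if there is none.
/// Assumed set for this sketch: control characters other than tab, plus DEL.
size_t findInvalid(const(char)[] buf) pure nothrow @nogc @safe
{
    foreach (i, c; buf)
        if ((c < 0x20 && c != '\t') || c == 0x7F)
            return i;
    return buf.length;
}

void main()
{
    // The scan stops at the CR that ends the line.
    assert(findInvalid("text/html\r\n") == 9);
    // Tabs and printable characters pass.
    assert(findInvalid("a\tb") == 3);
}
```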


* It has a pretty thorough test suite.
* It can parse incomplete message headers.
* It can continue parsing from the last completely parsed line.
* It doesn't enforce the method or protocol version itself, so it is usable 
with other internet-message-like protocols, for example RTSP.
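
The resume-from-the-last-complete-line behaviour can be pictured with a toy scanner. This is only an illustration of the idea and not httparsed's API: `consumeCompleteLines` is a hypothetical function that counts CRLF-terminated lines instead of parsing them.

```d
/// Toy incremental scanner (an illustration, not httparsed's API).
/// Counts complete CRLF-terminated lines in `buf` starting at `from` and
/// returns the offset just past the last complete line.
size_t consumeCompleteLines(const(char)[] buf, size_t from, ref size_t lines)
    pure nothrow @nogc @safe
{
    size_t pos = from;
    for (size_t i = from; i + 1 < buf.length; ++i)
    {
        if (buf[i] == '\r' && buf[i + 1] == '\n')
        {
            ++lines;
            pos = i + 2; // resume point: first byte after this line
            ++i;         // skip the '\n' that was just consumed
        }
    }
    return pos;
}

void main()
{
    size_t lines;

    // The first read from the socket ends in the middle of a header line.
    string part1 = "GET / HTTP/1.1\r\nHost: exam";
    size_t pos = consumeCompleteLines(part1, 0, lines);
    assert(lines == 1 && pos == 16);

    // More bytes arrive; continue from `pos` instead of re-reading the request line.
    string full = part1 ~ "ple.com\r\n\r\n";
    pos = consumeCompleteLines(full, pos, lines);
    assert(lines == 3 && pos == full.length);
}
```

The caller keeps only the returned offset between reads, so work already done on complete lines is never repeated.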


Performance-wise it's pretty much on par with picohttpparser [3]. 
Without SSE4.2 it's a bit faster, with SSE4.2 it's a bit slower, 
and I can't figure out why :/.

But overall, I'm pretty happy with the outcome.

I've also compared it with two popular libraries:

* vibe-d - performs nearly the same as http_parser[4] (which 
itself is pretty slow and now obsolete), but by the looks of it doesn't 
do much for RFC conformance; some tests from [2] won't 
pass for sure


* arsd's cgi.d - I didn't expect it to be so much slower than 
the vibe-d parser; it's almost 3 times slower, but on the other hand 
it's super simple idiomatic D (again, it doesn't check or allow what 
the RFC says it should, and many tests will fail)
  * I guess the main problem is `idup` on every line and 
autodecoding
  * A stripped-down minimalistic version of the original [5] is 
here [6]
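
A small sketch of the two costs named above, using only a stand-in buffer rather than the actual cgi.d code: `idup` makes a fresh GC-allocated copy, whereas a slice aliases the buffer, and Phobos range primitives treat a `char[]` as a range of decoded `dchar` unless code units are requested with `byCodeUnit`.

```d
import std.range.primitives : ElementType;
import std.utf : byCodeUnit;

void main()
{
    char[] buffer = "Host: example.com\r\n".dup; // stand-in for a receive buffer

    // Slicing: no allocation, the slice aliases the buffer.
    const(char)[] name = buffer[0 .. 4];
    assert(name.ptr == buffer.ptr);

    // idup: allocates a fresh immutable copy on the GC heap on every call.
    string copy = buffer[0 .. 4].idup;
    assert(copy == name && copy.ptr != buffer.ptr);

    // Autodecoding: Phobos range algorithms see a char[] as a range of dchar...
    static assert(is(ElementType!(char[]) == dchar));
    // ...unless the code units are requested explicitly.
    static assert(is(typeof(buffer.byCodeUnit.front) : char));
}
```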


[1] https://code.dlang.org/packages/httparsed
[2] https://github.com/h2o/picohttpparser
[3] https://i.imgur.com/iRCDGVo.png
[4] https://github.com/nodejs/http-parser
[5] https://github.com/adamdruppe/arsd/blob/402ea062b81197410b05df7f75c299e5e3eef0d8/cgi.d#L1737
[6] https://github.com/tchaloupka/httparsed/blob/230ba9a4a280ba91267a22e97137be12269b5574/bench/bench.d#L194