I am comparing different languages and have chosen a simple task (an HTTP request header converter) as the benchmark. My J implementation of the task, an example input, and an example output are attached.

My J implementation converts about 2000 headers per second (on my 3 GHz Pentium 4).
The Perl implementation converts about 6600 headers per second.
The C++ implementation converts about 10000 headers per second.

Is it possible to improve my J implementation?

PS:

Here is the task's description:

The parser shall take a file containing a stream of HTTP requests with empty
message bodies (as specified in RFC 2616). The output of the parser shall be a
file consisting of records separated by an empty line. The actions of the
parser in case of invalid input are explicitly unspecified.

Each record shall be in RFC 822 header format. That is, a record consists of
attribute/value pairs, stored one attribute per line. Each line begins with the
attribute name, terminated by a colon followed by whitespace. Attribute names
do not contain whitespace; a dash is substituted instead. The attribute value
is the entire remainder of the line, exclusive of trailing whitespace and the
newline. A physical line that begins with a tab or space is interpreted as a
continuation of the current logical line. A blank line is a record terminator.
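
The record format described above can be sketched in a few lines of Python. This is only an illustration of the format, not a translation of the J code; the function names format_record and parse_record are hypothetical, and joining continuation lines with a single space is an assumption the description does not spell out.

```python
def format_record(pairs):
    """Serialize (name, value) pairs as one RFC 822-style record."""
    lines = []
    for name, value in pairs:
        # Attribute names may not contain whitespace; a dash is substituted.
        name = "-".join(name.split())
        # The value excludes trailing whitespace, so an empty value
        # is written as just "NAME:".
        lines.append(f"{name}: {value}".rstrip())
    # A blank line terminates the record.
    return "\n".join(lines) + "\n\n"


def parse_record(text):
    """Read one record back, joining continuation lines (assumed: with a space)."""
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            break  # blank line: end of record
        if line[0] in " \t" and pairs:
            # A line starting with a tab or space continues the previous one.
            name, value = pairs[-1]
            pairs[-1] = (name, (value + " " + line.strip()).strip())
        else:
            name, _, value = line.partition(":")
            pairs.append((name, value.strip()))
    return pairs
```

For example, format_record([("METHOD", "GET"), ("QUERY", "")]) yields the two lines "METHOD: GET" and "QUERY:" followed by a blank line.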

Each record shall contain attributes named METHOD, HTTP-PROTOCOL-VERSION,
PROTOCOL, HOST, PORT, RESOURCE and QUERY, with the obvious meanings. In
addition, all message headers of the original request shall be present in the
record. The order of the headers in the output is unspecified.

Example: for input file

------------------ BEGIN --------------------
GET http://somewhere:1023/fdsfsdf?fdsfd HTTP/1.1
X-TTTT: sdfsdfsdf


GET /12345 HTTP/1.1
Host: localhost


GET /x


------------------ END ----------------------

the result could be

------------------ BEGIN --------------------
METHOD: GET
HTTP-PROTOCOL-VERSION: HTTP/1.1
PROTOCOL: http
HOST: somewhere
PORT: 1023
RESOURCE: fdsfsdf
QUERY: fdsfd
X-TTTT: sdfsdfsdf

METHOD: GET
HTTP-PROTOCOL-VERSION: HTTP/1.1
PROTOCOL: http
HOST: localhost
PORT: 80
RESOURCE: 12345
QUERY:

METHOD: GET
HTTP-PROTOCOL-VERSION: HTTP/0.9
PROTOCOL: http
HOST: <your host name>
PORT: 80
RESOURCE: /x
QUERY:
------------------ END ----------------------
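
How a request line maps to the seven required attributes can be sketched as follows. This is a Python illustration under stated assumptions, not the J implementation: the name request_attributes is hypothetical, the default port is 80 for every protocol (as in the attached output), the leading slash is dropped from RESOURCE (as in the first two example records), and Host header handling is left out. A request line with fewer than two or more than three fields raises an error, since behaviour on invalid input is unspecified.

```python
import socket


def request_attributes(request_line, local_host=None):
    """Map an HTTP request line to the seven required attributes."""
    if local_host is None:
        local_host = socket.gethostname()
    parts = request_line.split()
    if len(parts) == 2:
        # No version given: treat as an HTTP/0.9 simple request.
        parts.append("HTTP/0.9")
    method, url, version = parts

    # Defaults used when the URL is not absolute.
    protocol, host, port = "http", local_host, "80"
    if not url.startswith("/") and "://" in url:
        protocol, _, rest = url.partition("://")
        authority, slash, path = rest.partition("/")
        host, _, explicit_port = authority.partition(":")
        port = explicit_port or port
        url = slash + path

    resource, _, query = url.partition("?")
    return [
        ("METHOD", method),
        ("HTTP-PROTOCOL-VERSION", version),
        ("PROTOCOL", protocol),
        ("HOST", host),
        ("PORT", port),
        # Assumption: leading slash(es) are dropped from the resource.
        ("RESOURCE", resource.lstrip("/")),
        ("QUERY", query),
    ]
```

Called with the first request line of the example, this returns the same seven attribute values as the first record of the example result.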


GET http://somewhere:1023/fdsfsdf?fdsfd HTTP/1.1
X-TTTT: sdfsdfsdf


GET https://www.site.com/asdf/qwer?query=asdf HTTP/1.1


GET /asdf/qwer?query=asdf HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/msword, application/vnd.ms-powerpoint, application/x-icq, */*
Accept-Language: ru
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
Host: ws:81
Connection: Keep-Alive


GET /asdf/qwer?query=asdf HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/msword, application/vnd.ms-powerpoint, application/x-icq, */*
Accept-Language: ru
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
Host: ws:81
Connection: Keep-Alive


#!/bin/j

require 'jpm socket strings'

IFNAME =: 'headers'
OFNAME =: 'records'

HCNT =: 0
LHN =: , 1{::sdgethostname_jsocket_''

NB. ================== parse ====================

PPs =: 50 3 $ 'pp0'
PPs =: ('pp1')(7 8 9 27 28 29)}PPs
PPs =: ('pp2')(13 14)}PPs
PPs =: ('pp3')(15 18 19)}PPs
PPs =: ('pp4')(17)}PPs
PPs =: ('pp5')(38 39)}PPs
PPs =: ('pp6')(49)}PPs

pp0 =: monad : ''''''
pp1 =: monad : '''''[AVs =: (<y.)(3)}AVs'
pp2 =: monad : 'pp1 _1}.y.'
pp3 =: monad : 'pp1 _2}.y.'
pp4 =: monad : '''''[AVs =: (<_2}.y.)(2)}AVs'
pp5 =: monad : '''''[AVs =: (<y.)(5)}AVs'
pp6 =: monad : '''''[AVs =: (<y.)(6)}AVs'

NB. State table for the sequential machine (;:) that tokenizes the URL.
pusm =: 10 5 2 $ 1 1 5 1 6 0 8 0 0 0 1 0 2 0 6 3 8 3 0 3 5 0 0 6 3 0 8 3 0 3 7 2 0 6 4 3 8 3 0 3 5 1 5 1 6 1 8 0 0 0 5 0 5 0 6 3 8 3 0 3 7 1 0 6 6 0 8 0 0 0 7 0 0 6 7 0 8 3 0 3 9 1 9 1 9 1 9 1 0 0 9 0 9 0 9 0 9 0 0 3
process_url =: monad define
    y. =. y.,LF
    t =. (4;pusm;<(a.-.':/? ',TAB,LF);':';'/';'?';' ',TAB,LF) ;: y.

    for_j. t do.
        'i l f' =. j
        (f{PPs)128!:2(l{.i}.y.)
    end.

    ''return.
)

process_method =: monad define
    ANs =: 'METHOD:';'HTTP-PROTOCOL-VERSION:';'PROTOCOL:';'HOST:';'PORT:';'RESOURCE:';'QUERY:'
    AVs =: '';'HTTP/0.9';'http';LHN;'80';'';''

    t =. (<;._2)y.,' '
    if. 2=#t do. t =. t,<'HTTP/0.9' end.
    if. 3=#t do.
        'm u v' =. t
        AVs =: (m;v)(0 1)}AVs
        process_url u
    end.

    AVs =: (' '&,@:,&LF)each AVs
    
    ''return.
)

NB. State table for the sequential machine (;:) that tokenizes the message headers.
pasm =: 5 4 2 $ 1 1 0 6 0 6 0 6 1 0 2 3 0 6 0 6 3 1 3 1 3 1 3 1 3 0 3 0 4 0 3 0 1 2 0 3 0 6 3 0
process_attrs =: monad define
    if. #y. do.
        t =. (0;pasm;<(a.-.': ',LF,TAB);':';LF;' ',TAB);:y.
        t =. |:((2%~#t),2)$t
        n =. (toupper@,&':') each 0{t
        v =. 1{t
        i =. ANs i.n
        u =. i=#ANs
        k =. -.u
        AVs =: (k#v)(k#i)}AVs
        ANs =: ANs,u#n
        AVs =: AVs,u#v
    end.

    ''return.
)

postprocess_host =: monad define
    v =. 3{::AVs
    i =. v i.':'
    if. i<#v do.
        h =. deb i{.v
        if. 0=#h do. h =. LHN end.
        p =. ((>:i)}.v)-.LF
        if. 0=#p do. p =. '80' end.
        AVs =: ((' ',h,LF);(' ',p,LF))(3 4)}AVs
    end.
    ''return.
)

save_result =: monad : 'TEXT =: TEXT,(;ANs,.AVs),LF'

process_header =: monad define
    y. =. 2}.y.,LF
    HCNT =: >:HCNT

    i =. y.i.LF
    process_method i{.y.
    process_attrs (>:i)}.y.

    postprocess_host''
    save_result''
    
    ''return.
)

NB. ================== read =====================

main =: monad define
    TEXT =: ''

    fn =. IFNAME
    fs =. 1!:4<fn
    fp =. 0
    fss =. 1e7
    buf =. LF,LF
    '' 1!:2 <OFNAME
    while. 1 do.
        if. fp<fs do.
            NB. read chunk
            n =. fss<.fs-fp
            t =. (buf, 1!:11 fn;fp,n)-.CR
            fp =. fp+n

            NB. cut headers
            m =. (LF,LF,LF)E.t
            m process_header;._2 t

            NB. remember tail
            i =. m i:1
            buf =. (>:i)}.t
        else.
            if. 3<#buf do. process_header buf end.
            break.
        end.
    end.

    (TEXT,LF) 1!:3 <OFNAME
    
    ''return.
)

NB.start_jpm_''
main''
NB.echo (0 0 100 showtotal_jpm_'')
NB.echo showdetail_jpm_ 'process_url'

echo HCNT

exit''
METHOD: GET
HTTP-PROTOCOL-VERSION: HTTP/1.1
PROTOCOL: http
HOST: somewhere
PORT: 1023
RESOURCE: fdsfsdf
QUERY: fdsfd
X-TTTT: sdfsdfsdf

METHOD: GET
HTTP-PROTOCOL-VERSION: HTTP/1.1
PROTOCOL: https
HOST: www.site.com
PORT: 80
RESOURCE: asdf/qwer
QUERY: query=asdf

METHOD: GET
HTTP-PROTOCOL-VERSION: HTTP/1.1
PROTOCOL: http
HOST: ws
PORT: 81
RESOURCE: asdf/qwer
QUERY: query=asdf
ACCEPT: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/msword, application/vnd.ms-powerpoint, application/x-icq, */*
ACCEPT-LANGUAGE: ru
ACCEPT-ENCODING: gzip, deflate
USER-AGENT: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
CONNECTION: Keep-Alive

METHOD: GET
HTTP-PROTOCOL-VERSION: HTTP/1.1
PROTOCOL: http
HOST: ws
PORT: 81
RESOURCE: asdf/qwer
QUERY: query=asdf
ACCEPT: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/msword, application/vnd.ms-powerpoint, application/x-icq, */*
ACCEPT-LANGUAGE: ru
ACCEPT-ENCODING: gzip, deflate
USER-AGENT: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
CONNECTION: Keep-Alive


----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
