Hi there,

I am wondering whether Riak is a good fit for what I want to do, and I would
like to ask here whether my design is feasible.

The goal is to get the total number of accesses to a given URL over a given
period, from the logs of my web server.

Each day I get the log file of the previous day (4249421 lines for one
server).

I process it to count the number of lines for each URL and create a file of
JSON lines ready for insertion (about 200000 per day for a platform). Each
line looks like:

    { "Count": 15, "Date":"20100813", "URL":"/home", "Method":"GET" }

I then insert them into my bucket 'PlatformLog' with random keys.
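
As a rough sketch of that aggregation step: the helper name aggregateLog and
the log format are assumptions here (each line is assumed to contain a quoted
request such as "GET /home HTTP/1.1", as in the common access log formats).
It only illustrates how the per-URL records could be built before insertion.

```javascript
// Hypothetical helper: count hits per (method, URL) pair for one day's log.
// Assumes each line contains a quoted request such as "GET /home HTTP/1.1".
function aggregateLog(lines, date) {
  const counts = new Map();
  const requestPattern = /"([A-Z]+) ([^\s"]+)[^"]*"/;
  for (const line of lines) {
    const match = requestPattern.exec(line);
    if (!match) continue; // skip lines without a recognizable request
    const key = match[1] + " " + match[2];
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Turn the counters into records shaped like the JSON line shown above.
  const records = [];
  for (const [key, count] of counts) {
    const [method, url] = key.split(" ");
    records.push({ Count: count, Date: date, URL: url, Method: method });
  }
  return records;
}

// One JSON document per line, ready to be sent to the bucket.
function toJsonLines(records) {
  return records.map((r) => JSON.stringify(r)).join("\n");
}
```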

Does that sound reasonable, or weird?

My example map/reduce query (REST):

{
    "inputs": "PlatformLog",
    "query": [
        {
            "map": {
                "language": "javascript",
                "source": "function(v, k, a) { var data = Riak.mapValuesJson(v)[0]; return (data.URL == a[0] && data.Date >= a[1] && data.Date < a[2]) ? [data.Count] : []; }",
                "keep": false,
                "arg": [ "/home", "20100801", "20100901" ]
            }
        },
        {
            "reduce": {
                "language": "javascript",
                "name": "Riak.reduceSum"
            }
        }
    ]
}
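
The intended filter-and-sum logic can be checked locally, outside Riak. The
sketch below uses plain stand-ins for the two phases (mapRecord, reduceSum and
totalHits are illustrative names, not Riak functions). It assumes the date is
stored in the "Date" field as in the sample record, and that the range is
start-inclusive and end-exclusive. YYYYMMDD strings compare correctly as plain
strings.

```javascript
// Stand-in for the map phase: emit the count when the record matches
// the URL and its Date falls in [start, end). arg = [url, start, end].
function mapRecord(data, arg) {
  const inRange = data.Date >= arg[1] && data.Date < arg[2];
  return inRange && data.URL === arg[0] ? [data.Count] : [];
}

// Stand-in for the reduce phase: sum all emitted counts.
function reduceSum(values) {
  return [values.reduce((total, n) => total + n, 0)];
}

// Run both phases over an in-memory list of records.
function totalHits(records, arg) {
  const mapped = records.flatMap((r) => mapRecord(r, arg));
  return reduceSum(mapped)[0];
}

const sample = [
  { Count: 15, Date: "20100813", URL: "/home", Method: "GET" },
  { Count: 7, Date: "20100814", URL: "/home", Method: "GET" },
  { Count: 3, Date: "20100902", URL: "/home", Method: "GET" },
  { Count: 9, Date: "20100813", URL: "/about", Method: "GET" },
];
console.log(totalHits(sample, ["/home", "20100801", "20100901"])); // 22
```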

I suppose a great enhancement would be date-based keys, so that more
filtering could be done in the inputs field.
Can I insert a whole table of 200000 objects as
http://server/PlatformLog/20100813 ?
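
With date-based keys, the "inputs" field could list explicit [bucket, key]
pairs instead of the whole bucket. A minimal sketch, assuming one object per
day stored under a YYYYMMDD key (dayKeys and buildInputs are hypothetical
helper names):

```javascript
// Hypothetical helper: list the YYYYMMDD keys for each day in [start, end).
function dayKeys(start, end) {
  // Parse in UTC so daylight-saving changes cannot skip or repeat a day.
  const parse = (s) =>
    Date.UTC(Number(s.slice(0, 4)), Number(s.slice(4, 6)) - 1, Number(s.slice(6, 8)));
  const oneDay = 24 * 3600 * 1000;
  const keys = [];
  for (let t = parse(start); t < parse(end); t += oneDay) {
    keys.push(new Date(t).toISOString().slice(0, 10).replace(/-/g, ""));
  }
  return keys;
}

// Build the "inputs" value as a list of [bucket, key] pairs.
function buildInputs(bucket, start, end) {
  return dayKeys(start, end).map((key) => [bucket, key]);
}

console.log(JSON.stringify(buildInputs("PlatformLog", "20100830", "20100902")));
```

The resulting list would replace the bucket name in "inputs", so that only
the days in the range are read.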

How should I handle the selection of keys? For example, what happens if one
of the keys in my date range does not exist?
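
One possible way to tolerate a missing day is a guard at the top of the map
function. The sketch below rests on two assumptions that should be checked
against the Riak version in use: that a missing key reaches a JavaScript map
function as an object carrying a "not_found" field, and that each daily object
holds a JSON array of records in values[0].data. The name mapDay is
hypothetical.

```javascript
// Sketch of a map function for one-object-per-day storage.
// Assumption: a missing key is passed in as { not_found: ... }.
function mapDay(value, keyData, arg) {
  if (value.not_found) {
    return []; // no log stored for that day: contribute nothing to the sum
  }
  // Assumption: the stored value is a JSON array of per-URL records.
  const records = JSON.parse(value.values[0].data);
  return records
    .filter((r) => r.URL === arg[0])
    .map((r) => r.Count);
}
```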

Thank you for your advice.

Best regards,

-- 
Damien