On Mon, Oct 25, 2010 at 07:34:13PM -0700, Dan Reverri wrote:
> I don't understand your use case; can you expand on what you are doing?

Perhaps this is in the category of "too much information," but since you asked: 
I have written an OAI-PMH provider in erlang. My backing store is a dets file, 
which I read into an ets table in memory when the server starts up. The 
memory-resident database consists of keys associated with values that are 
erlang terms. I use these terms to filter queries conforming to the OAI-PMH 
protocol specification. I create the dets file from a file of plain text. If 
that file held just one term (it will typically hold hundreds or thousands), 
it might look like this:

{oai_dc,"AEP-WYS97",
        {{2004,10,28},{17,21,25}},
        ["aep"],
        "/storage/aep2003/metadata/oai_dc/oai_dc-AEP-WYS97.xml"}.

In erlang itself, if I've called the file AEP-WYS97, I can do something like 
this:

1> {_, Term} = file:consult("AEP-WYS97").
{ok,[{oai_dc,"AEP-WYS97",
             {{2004,10,28},{17,21,25}},
             ["aep"],
             "/storage/aep2003/metadata/oai_dc/oai_dc-AEP-WYS97.xml"}]}

2> Term.
[{oai_dc,"AEP-WYS97",
         {{2004,10,28},{17,21,25}},
         ["aep"],
         "/storage/aep2003/metadata/oai_dc/oai_dc-AEP-WYS97.xml"}]

3> element(2, hd(Term)).
"AEP-WYS97"

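As a rough sketch of the dets/ets workflow described earlier (plain-text file 
of terms, then a dets file, then an ets table at startup), something like the 
following would do. The module name oai_store, the table names oai_dets and 
oai_ets, and the choice of the second tuple element as the key are assumptions 
made for illustration, not details taken from the actual provider.

```erlang
-module(oai_store).
-export([build_dets/2, load_ets/1]).

%% Read every term from a plain-text file and store it in a dets file.
%% Assumption: each term is a tuple whose second element is the key.
build_dets(TextFile, DetsFile) ->
    {ok, Terms} = file:consult(TextFile),
    {ok, Tab} = dets:open_file(oai_dets, [{file, DetsFile}, {keypos, 2}]),
    ok = dets:insert(Tab, Terms),
    ok = dets:close(Tab).

%% Copy the contents of the dets file into a fresh in-memory ets table.
load_ets(DetsFile) ->
    {ok, Dets} = dets:open_file(oai_dets, [{file, DetsFile}, {keypos, 2}]),
    Ets = ets:new(oai_ets, [set, {keypos, 2}]),
    Ets = dets:to_ets(Dets, Ets),
    ok = dets:close(Dets),
    Ets.
```

A lookup such as ets:lookup(Ets, "AEP-WYS97") then returns the stored tuple as 
an erlang term, with no parsing step at query time.
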
Now, if I replace dets/ets with riak, I can do two things (I think). The first 
would be to write an erlang function that does something like the above. The 
second (which is where my question comes in) is this: if I can load arbitrary 
data types into riak, is there a way to specify that what I'm loading are 
erlang terms? If I don't do that, and load strings instead, I run into this 
sort of nastiness:

([email protected])295> {ok, Tk} = C:get(<<"oai_dc">>, <<"dsalhensley-m025">>).

([email protected])296> V = riak_object:get_value(Tk).                         
<<"{oai_dc,dsalhensley-m025,        {{2004,11,9},{15,38,27}},        
[dsal,dsal:hensley],        /storage/dsal/hensley/"...>>

([email protected])300> A = binary_to_list(V).
"{oai_dc,dsalhensley-m025,        {{2004,11,9},{15,38,27}},        
[dsal,dsal:hensley],        
/storage/dsal/hensley/metadata/oai_dc/oai_dc-dsalhensley-m025.xml}"

([email protected])301> {ok, Tokens, _} = erl_scan:string(A).
{ok,[{'{',1},
     {atom,1,oai_dc},
     {',',1},
     {atom,1,dsalhensley},
     {'-',1},
     {atom,1,m025},
     {',',1},
     {'{',1},
     {'{',1},
     {integer,1,2004},
     {',',1},
     {integer,1,11},
     {',',1},
     {integer,1,9},
     {'}',1},
     {',',1},
     {'{',1},
     {integer,1,15},
     {',',1},
     {integer,1,38},
     {',',1},
     {integer,1,27},
     {'}',1},
     {'}',1},
     {',',1},
     {'[',...},
     {...}|...],
    1}

So far, so good. However:

([email protected])302> erl_parse:parse_term(Tokens).                          
{error,{1,erl_parse,["syntax error before: ","'/'"]}}

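This happens because the quotation marks in my input were not escaped, so the 
stored value no longer has them: dsalhensley-m025 is scanned as two atoms and 
a minus sign, and the path begins with a bare '/'. If I had intended my input 
to be strings, that is something I should have taken care of beforehand, but 
if I intend the input to be terms, I shouldn't have to.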

I'm prepared to use option 1 above, or else to convert my terms into strings 
and parse the strings, but string handling in erlang is not particularly 
efficient. So before I go either of those routes, I'm curious whether I can 
tell riak "this is an erlang term" and have it stored as such, so that when I 
go to use it, my code doesn't have to do any transformation that isn't related 
to the job at hand (filtering).
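
One route that avoids text parsing, whatever riak itself offers here, is to 
serialize each term with term_to_binary/1 before storing it and to call 
binary_to_term/1 on the value that comes back. The sketch below shows only the 
round trip; the module name term_codec is hypothetical, and the put/get calls 
against riak are deliberately left out.

```erlang
-module(term_codec).
-export([encode/1, decode/1]).

%% Turn any erlang term into an opaque binary suitable for storage.
encode(Term) ->
    term_to_binary(Term).

%% Recover the original term from a binary produced by encode/1.
%% Note: binary_to_term/1 should only be given binaries from a trusted source.
decode(Bin) when is_binary(Bin) ->
    binary_to_term(Bin).
```

In the session above, decode/1 would be applied to the result of 
riak_object:get_value(Tk), provided the value had been stored via encode/1.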

I'm relatively new to riak (less than a week), so perhaps there is a better 
approach entirely.

Thanks.




