On Mon, Oct 25, 2010 at 07:34:13PM -0700, Dan Reverri wrote:
> I don't understand your use case; can you expand on what you are doing?
Perhaps this is in the category of "too much information," but since you asked:
I have written an OAI-PMH provider in erlang. My backing store is a dets file,
which I read into an ets table when the server starts up.
The memory-resident database consists of keys associated with values that are
erlang terms. I use these terms for filtering queries conforming to the OAI-PMH
protocol specification. I create the dets file from a file
of plain text. If that file held just one term (it will typically hold hundreds
or thousands), it might look like this:
{oai_dc,"AEP-WYS97",
{{2004,10,28},{17,21,25}},
["aep"],
"/storage/aep2003/metadata/oai_dc/oai_dc-AEP-WYS97.xml"}.
In erlang itself, if I've called the file AEP-WYS97, I can do something like
this:
1> {_, Term} = file:consult("AEP-WYS97").
{ok,[{oai_dc,"AEP-WYS97",
{{2004,10,28},{17,21,25}},
["aep"],
"/storage/aep2003/metadata/oai_dc/oai_dc-AEP-WYS97.xml"}]}
2> Term.
[{oai_dc,"AEP-WYS97",
{{2004,10,28},{17,21,25}},
["aep"],
"/storage/aep2003/metadata/oai_dc/oai_dc-AEP-WYS97.xml"}]
3> element(2, hd(Term)).
"AEP-WYS97"
Now, if I replace dets/ets with riak, I can do two things (I think). The first
would be to write an erlang function that does something like the above. The
second (which is where my question comes into play) is this: if I can load
arbitrary data types into riak, is there a way I can specify that what I'm
loading are erlang terms? If I don't do that, and load strings, I run into this
sort of nastiness:
([email protected])295> {ok, Tk} = C:get(<<"oai_dc">>, <<"dsalhensley-m025">>).
([email protected])296> V = riak_object:get_value(Tk).
<<"{oai_dc,dsalhensley-m025, {{2004,11,9},{15,38,27}},
[dsal,dsal:hensley], /storage/dsal/hensley/"...>>
([email protected])300> A = binary_to_list(V).
"{oai_dc,dsalhensley-m025, {{2004,11,9},{15,38,27}},
[dsal,dsal:hensley],
/storage/dsal/hensley/metadata/oai_dc/oai_dc-dsalhensley-m025.xml}"
([email protected])301> {ok, Tokens, _} = erl_scan:string(A).
{ok,[{'{',1},
{atom,1,oai_dc},
{',',1},
{atom,1,dsalhensley},
{'-',1},
{atom,1,m025},
{',',1},
{'{',1},
{'{',1},
{integer,1,2004},
{',',1},
{integer,1,11},
{',',1},
{integer,1,9},
{'}',1},
{',',1},
{'{',1},
{integer,1,15},
{',',1},
{integer,1,38},
{',',1},
{integer,1,27},
{'}',1},
{'}',1},
{',',1},
{'[',...},
{...}|...],
1}
So far, so good. However:
([email protected])302> erl_parse:parse_term(Tokens).
{error,{1,erl_parse,["syntax error before: ","'/'"]}}
This happens because I did not escape the quotation marks in the input, so they
were lost and the bare path is not a valid term. If I had
intended my input to be strings, then that's something I should have taken care
of beforehand, but if I intend the input to be terms, then I shouldn't have to.
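As a rough illustration of the difference, the sketch below scans and parses two strings: one with its quotation marks intact and one without. The module name parse_term_sketch and its function names are made up for this example; erl_scan:string/1 and erl_parse:parse_term/1 are the same standard library calls used in the session above. Note that parse_term/1 also expects the token list to end with a full stop, so both input strings carry a trailing period.

```erlang
-module(parse_term_sketch).
-export([parse/1, demo/0]).

%% Scan a string into tokens, then parse the tokens as a single term.
%% Returns {ok, Term} or an {error, ErrorInfo} tuple.
parse(String) ->
    case erl_scan:string(String) of
        {ok, Tokens, _EndLocation} ->
            erl_parse:parse_term(Tokens);
        {error, ErrorInfo, _ErrorLocation} ->
            {error, ErrorInfo}
    end.

demo() ->
    %% Quotation marks preserved (escaped inside the Erlang string literal).
    Quoted = "{oai_dc,\"dsalhensley-m025\",{{2004,11,9},{15,38,27}},"
             "[\"dsal\",\"dsal:hensley\"],"
             "\"/storage/dsal/hensley/metadata/oai_dc/"
             "oai_dc-dsalhensley-m025.xml\"}.",
    {ok, Term} = parse(Quoted),
    %% Quotation marks lost: the bare path is not valid term syntax.
    Unquoted = "{oai_dc,dsalhensley-m025,/storage/dsal/hensley}.",
    {error, _Reason} = parse(Unquoted),
    Term.
```

Calling parse_term_sketch:demo() should return the parsed tuple, with the identifier and path as ordinary erlang strings.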
I'm prepared to use option 1, above, or else convert my terms into strings and
parse the strings, but string handling in erlang is not particularly efficient.
So before I go either of those routes, I'm curious whether I can tell riak "this
is an erlang term" and have it stored as such, so that when I go to use it, my
code doesn't have to do any further transformation that isn't related to the job
at hand (filtering).
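One possible approach, sketched here under assumptions rather than as a statement of how riak is meant to be used: serialize each term with term_to_binary/1 before storing it, and recover it with binary_to_term/1 after fetching it. Both are erlang built-ins. The module name term_roundtrip and the functions encode/1, decode/1 and demo/0 are hypothetical, and the riak put and get calls are deliberately left out so the round trip can be tried on its own.

```erlang
-module(term_roundtrip).
-export([encode/1, decode/1, demo/0]).

%% Serialize any erlang term into a binary (external term format).
%% The resulting binary is what would be stored as the object's value.
encode(Term) ->
    term_to_binary(Term).

%% Turn a previously encoded binary back into the original term.
decode(Bin) when is_binary(Bin) ->
    binary_to_term(Bin).

demo() ->
    Term = {oai_dc, "AEP-WYS97",
            {{2004,10,28},{17,21,25}},
            ["aep"],
            "/storage/aep2003/metadata/oai_dc/oai_dc-AEP-WYS97.xml"},
    Bin = encode(Term),
    %% Matching against the bound variable asserts the round trip is lossless.
    Term = decode(Bin),
    %% The decoded value is a real tuple, so element/2 works directly.
    element(2, Term).
```

Because the binary decodes straight back into a term, no scanning or parsing of strings is involved before filtering.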
I'm relatively new to riak (less than a week), so perhaps there is a better
approach entirely.
Thanks.