Nah. It should index via the standard analyzer, which splits on spaces, among 
other things (check the Riak Handbook, a great resource). The "guid:val" token 
would index as a string, so guid:100 should show up in a search for 
[guid:050 TO guid:250]. Try it out. 
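
Here is a rough sketch of why the constant length matters. It is plain Python, not anything Riak-specific. The helper names (make_prefix, prepend_metadata, in_string_range) are made up for illustration, and it assumes the range query compares terms as plain strings, which is worth verifying on your own cluster:

```python
def make_prefix(field: str, value: int, width: int = 3) -> str:
    """Build a fixed-width token such as 'guid:050' (value zero-padded to `width`)."""
    if value < 0 or value >= 10 ** width:
        raise ValueError(f"value {value} does not fit in {width} digits")
    return f"{field}:{value:0{width}d}"


def prepend_metadata(text: str, **fields: int) -> str:
    """Prepend space-separated field:value tokens to the raw text."""
    tokens = [make_prefix(name, val) for name, val in fields.items()]
    return " ".join(tokens + [text])


def in_string_range(token: str, low: str, high: str) -> bool:
    """Inclusive string comparison (assumed to mirror a string range query)."""
    return low <= token <= high


doc = prepend_metadata("the quick brown fox", guid=100)
first_token = doc.split()[0]

print(doc)
print(in_string_range(first_token, "guid:050", "guid:250"))
# An unpadded value breaks the ordering: 90 is numerically inside 50..250,
# but "guid:90" sorts after "guid:250" when compared as a string.
print(in_string_range("guid:90", "guid:050", "guid:250"))
```

With padding, guid:100 falls inside the range. Without it, guid:90 does not, even though 90 is between 50 and 250. That is the reason to keep the value width constant.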


@siculars
http://siculars.posterous.com

Sent from my iRotaryPhone

On Jul 21, 2012, at 4:17, Metin Akat <[email protected]> wrote:

> Yes, but that would require me to write a custom search analyzer to parse 
> this, upload Erlang code to Riak, etc., right? Or is there something I don't 
> know? Please elaborate.
> 
> On Sat, Jul 21, 2012 at 11:06 AM, Alexander Sicular <[email protected]> 
> wrote:
> The overhead would be in parsing. But you could skip all that if you 
> prepended constant-length data to your text. Something like:
> 
> field:val field:val text
> 
> where the field and val lengths are constant.
> 
> Maybe something like guid:100,
> 
> where that guid is known to you to be the file size.
> 
> 
> On Jul 21, 2012, at 2:16, Metin Akat <[email protected]> wrote:
> 
>> I was thinking about this too, but as I said, these text files are sometimes 
>> quite big: sometimes megabytes, rarely tens of megabytes. They are all 
>> "write once, read quite a lot". So having them as JSON is probably going to 
>> put quite a lot of load on Riak and my application (deserializing a big 
>> chunk of JSON on every read). Of course, I might be wrong, and I'll probably 
>> have to benchmark it, but I don't feel very comfortable about it. 
>> Besides potentially being a performance issue, it also feels quite ugly 
>> to me. Have you done this? How big were the files? How was the performance?
>> 
>> On Sat, Jul 21, 2012 at 7:52 AM, Alexander Sicular <[email protected]> 
>> wrote:
>> Turn your text into a json obj. Maybe something like this:
>> 
>> {
>>   "size": 100,
>>   "name": "bla",
>>   "date": "1/1/2012",
>>   "raw_txt": "txt"
>> }
>> 
>> 
>> On Jul 20, 2012, at 17:49, Metin Akat <[email protected]> wrote:
>> 
>> > Hi,
>> >
>> > I am using Riak to store (relatively large) text files. I store them as 
>> > normal Riak objects where the value is the text of the file. Now I want to 
>> > index and search them. That part is fine: I just enabled the "standard" 
>> > search pre-commit hook for that bucket and they get indexed nicely. But 
>> > there is one tricky requirement. I need to be able to index and search some 
>> > metadata about these files, for example date of submission, size of file, 
>> > type of file (internal business logic), etc.
>> >
>> > I have been thinking quite a lot about this recently. Asked several times 
>> > on #riak. I got one answer suggesting that I create a second "metadata" 
>> > riak object for each file, link it to the "file object" and index it 
>> > separately. That's not really what I want, because I need to be able to 
>> > execute "combined" queries, like value:<some word> AND date:<some date>.
>> >
>> > So, here is the ideal solution that I'm thinking about. It would be 
>> > great if it were possible to modify the Riak Search index object. After 
>> > the file is submitted and indexed, I could just fetch the index 
>> > and add some more fields to it.
>> > I see there is a bucket with the search index objects that's automatically 
>> > created by riak search. So I guess it is indeed possible, though I don't 
>> > know what to expect. Is it a good idea? If not, what else could I do in 
>> > order to solve the problem?
>> >
>> > Regards,
>> > Metin
>> > _______________________________________________
>> > riak-users mailing list
>> > [email protected]
>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> 
> 
