RE: Is there a way to round data when index, but still able to return original content?

2012-12-10 Thread Swati Swoboda
When you apply your analyzers/filters/tokenizers, the resulting value is what 
goes into the index; the original input value is what actually gets stored. 
For example, from a schema.xml file:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

This particular field type will strip out the HTML. So if the input is:

<b>Hello</b>

It's being tokenized in the index (and lowercased by the LowerCaseFilterFactory) 
as

hello

It's being stored (and hence returned to you) as

<b>Hello</b>

So you can create your own charFilter or filter class which converts your date 
for the indexer, but the original data will automatically be stored.
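
To make the stored-versus-indexed split concrete, here is a toy model in plain 
Java. It does not use Solr or Lucene classes; the class and method names are 
made up for illustration, and the regex is only a crude stand-in for what 
HTMLStripCharFilterFactory really does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Toy model of one field: NOT Solr code, all names here are hypothetical. */
final class StoredVsIndexed {

    /** The stored value is simply the raw input, untouched by analysis. */
    static String stored(String input) {
        return input;
    }

    /** Approximates the analysis chain: strip tags, split on whitespace, lowercase. */
    static List<String> indexedTerms(String input) {
        // Crude stand-in for the HTML-stripping char filter (assumption: simple tags only).
        String text = input.replaceAll("<[^>]*>", " ");
        List<String> terms = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {    // whitespace tokenizer
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));  // lowercase filter
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        String input = "<b>Hello</b>";
        System.out.println("stored:  " + stored(input));
        System.out.println("indexed: " + indexedTerms(input));
    }
}
```

Searches match against the terms from indexedTerms, while the document you get 
back carries the value from stored.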

I hope this makes sense.

-----Original Message-----
From: jefferyyuan [mailto:yuanyun...@gmail.com] 
Sent: Monday, December 10, 2012 10:24 AM
To: solr-user@lucene.apache.org
Subject: Re: Is there a way to round data when index, but still able to return 
original content?

Erick, Thanks for your reply.

I know how to implement solution 1.

But I have no idea how to implement the solution 2 you mentioned:
===
If you put some sort of (perhaps custom) filter in place, then the original 
value would go in as stored and the altered value would get in the index and 
you could do both in the same field. 

Can you please describe in more detail how to store the original data and 
index the altered value in the same field?

Thanks :)







--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-there-a-way-to-round-data-when-index-but-still-able-to-return-original-content-tp4025405p4025695.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Is there a way to round data when index, but still able to return original content?

2012-12-10 Thread jefferyyuan
Sorry to ask another question, but I want to round date (TrieDateField) and
TrieLongField values, and it seems these types don't support configuring an
analyzer (charFilter, tokenizer or filter).

What should I do? I am now thinking of writing my own custom date or long
field. Is there any other way? :)

Thanks :)
 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-there-a-way-to-round-data-when-index-but-still-able-to-return-original-content-tp4025405p4025793.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Is there a way to round data when index, but still able to return original content?

2012-12-10 Thread Swati Swoboda
Hi,

Nope...they don't. Generally, I am not sure I'd bother rounding this 
information to reduce the index size. Have you determined how much index 
space you'll actually be saving? I am not confident that it'd be worth your 
time; i.e. I'd just go with indexing/storing the time information as well. 

Regardless, if you do want to go this route, the only way I can think of that 
wouldn't be a complicated solution is to have one field that is 
indexed/rounded (and not stored) and another field that is just stored (and not 
indexed).
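
A minimal sketch of that two-field split, done on the client before the 
document is sent to Solr. It uses only java.time; the field names created_day 
and created_original are hypothetical, and rounding to midnight UTC is an 
assumption about what "rounding" should mean here.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.LinkedHashMap;
import java.util.Map;

/** Client-side helper; field names are hypothetical, not from any real schema. */
final class RoundedDateFields {

    /** Returns the two field values to send for one document. */
    static Map<String, String> fieldsFor(String isoTimestamp) {
        Instant original = Instant.parse(isoTimestamp);
        // Drop the time of day; truncation is relative to UTC.
        Instant day = original.truncatedTo(ChronoUnit.DAYS);

        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("created_day", day.toString());           // search on this one
        fields.put("created_original", original.toString()); // return this one
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(fieldsFor("2012-12-21T12:12:12Z"));
    }
}
```

In the schema, the first field would then be declared with indexed="true" 
stored="false" and the second with indexed="false" stored="true".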

Hope this helps.

-----Original Message-----
From: jefferyyuan [mailto:yuanyun...@gmail.com] 
Sent: Monday, December 10, 2012 3:14 PM
To: solr-user@lucene.apache.org
Subject: RE: Is there a way to round data when index, but still able to return 
original content?

Sorry to ask another question, but I want to round date (TrieDateField) and 
TrieLongField values, and it seems these types don't support configuring an 
analyzer (charFilter, tokenizer or filter).

What should I do? I am now thinking of writing my own custom date or long 
field. Is there any other way? :)

Thanks :)
 





Re: Is there a way to round data when index, but still able to return original content?

2012-12-08 Thread Erick Erickson
Depends on whether the transformation is before or after the doc gets sent
to Solr. If you're changing the data before you give it to Solr, then you'd
have to have two fields, probably indexed=true and stored=false for the one
you search on, and indexed=false stored=true for the one you return to the
user.

This really doesn't take any more resources than using one field.

If you put some sort of (perhaps custom) filter in place, then the original
value would go in as stored and the altered value would get in the index
and you could do both in the same field.

Best
Erick


On Sat, Dec 8, 2012 at 2:34 PM, jefferyyuan <yuanyun...@gmail.com> wrote:

 Hi:

 I am wondering whether there is a way to round data at index time but still
 be able to return the original content.

 For example, take a date field value of 2012-12-21T12:12:12Z. When searching,
 the user only cares about the date part, so I can round it to
 2012-12-21T00:00:00Z when indexing. This can reduce the index size, as there
 will be fewer terms.

 But the user still wants to get the original content, so a matched doc
 should return 2012-12-21T12:12:12Z, not 2012-12-21T00:00:00Z.

 This also applies to number and text fields.

 Is there a way to do this in Solr?

 Thanks for your reply :)





 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Is-there-a-way-to-round-data-when-index-but-still-able-to-return-original-content-tp4025405.html
 Sent from the Solr - User mailing list archive at Nabble.com.