Re: Unit of dimension for solr field

2013-11-12 Thread eakarsu
Erick,

I haven't written a Solr plugin before, so it takes time to understand the
concepts.

This is simpler to implement, and I think this way I do not need to write
any Solr plugin, do I?
An outside process analyses the values with dimensions and prepares the 2
fields as you described.

Erol Akarsu



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unit-of-dimension-for-solr-field-tp4100209p4100449.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Unit of dimension for solr field

2013-11-12 Thread Erick Erickson
Yep, doing this outside Solr at ingestion should be a simpler model if
you already have an external ingestion method. Otherwise a custom
update processor would be reasonably easy.

Best,
Erick





Re: Unit of dimension for solr field

2013-11-11 Thread eakarsu
Thanks, Upayavira.

That seems to need too much work. I will have several more fields that will
have unit values.
Is there a quicker way of implementing it?

Solr ships with a currency field type by default. Can we use it, creating a
conversion rate table for each field? What I am expecting from
units is similar to the currency field.

Erol Akarsu






Re: Unit of dimension for solr field

2013-11-11 Thread Ryan Cutter
I think Upayavira's suggestion of writing a filter factory fits what you're
asking for.  However, the other end of cleverness is to simply use
solr.TrieIntField and store everything in MB.  So for 50GB you'd
write 51200 (and for 1TB, 1048576).  A range query for 256MB to 1GB would
be field:[256 TO 1024].

Conversion from MB to your displayed unit (2TB, for example) would happen
in the application layer.  But using trie ints would be simple and
efficient.
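
A minimal sketch of the store-everything-in-MB idea, assuming binary multiples (1 GB = 1024 MB) and whole-number inputs. The function names and the `memory_mb` field name are hypothetical, chosen only for illustration; the application layer would run something like this before indexing and before querying.

```python
import re

# Megabytes per unit, assuming binary multiples (1 GB = 1024 MB).
MB_PER_UNIT = {"MB": 1, "GB": 1024, "TB": 1024 * 1024}

# Accepts "256 MB", "50GB", "3tb"; whole numbers only in this sketch.
_SIZE = re.compile(r"^\s*(\d+)\s*(MB|GB|TB)\s*$", re.IGNORECASE)


def to_mb(text):
    """Convert a size string such as '256 MB' or '3TB' to whole megabytes."""
    match = _SIZE.match(text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    amount, unit = match.groups()
    return int(amount) * MB_PER_UNIT[unit.upper()]


def mb_range_query(field, low, high):
    """Build an inclusive Solr range query over an integer MB field."""
    return f"{field}:[{to_mb(low)} TO {to_mb(high)}]"


if __name__ == "__main__":
    print(to_mb("50GB"))
    print(mb_range_query("memory_mb", "256MB", "1GB"))
```

Converting back from MB to a display unit would stay in the application layer, as described above.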

- Ryan





Re: Unit of dimension for solr field

2013-11-11 Thread Jack Krupansky
A custom token filter may indeed be the right way to go, but an alternative 
is the combination of an update processor and a query preprocessor.


The update processor, which could be a JavaScript script, could normalize the 
string into a simple integer byte count. You might also want to keep 
separate fields, one for the raw string and one for the final byte count. A 
JavaScript script would be a lot easier to develop than a custom token 
filter.


A query preprocessor could do two things: first, the same string-to-byte-count 
normalization as the update processor; second, generate a range query. 
So, for example, a query for 0.5 TB could match 512 GB, 500 GB, etc., by 
searching a range around the normalized byte count instead of one exact value.


Technically, you could implement a query preprocessor as a Solr search 
component plugin, but if that sounds like too much effort, an 
application-level implementation would probably be easier to master.
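
An application-level query preprocessor along these lines might look like the sketch below. It is only an illustration under stated assumptions: binary multiples, a hypothetical `disk_bytes` field, and a 5% tolerance picked arbitrarily so that "500 GB" and "512 GB" both fall inside the range generated for "0.5 TB".

```python
import re

# Bytes per unit, assuming binary multiples; switch to powers of 1000 if the
# data uses decimal units.
BYTES_PER_UNIT = {"MB": 1024 ** 2, "GB": 1024 ** 3, "TB": 1024 ** 4}

# Accepts fractional amounts such as "0.5 TB".
_SIZE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(MB|GB|TB)\s*$", re.IGNORECASE)


def to_bytes(text):
    """Normalize a size string to an integer byte count."""
    match = _SIZE.match(text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    amount, unit = match.groups()
    return round(float(amount) * BYTES_PER_UNIT[unit.upper()])


def approx_bounds(text, tolerance=0.05):
    """Return (low, high) byte counts within +/- tolerance of the value."""
    centre = to_bytes(text)
    return int(centre * (1 - tolerance)), int(centre * (1 + tolerance))


def approx_query(field, text, tolerance=0.05):
    """Rewrite a user-entered size into an inclusive Solr range query."""
    low, high = approx_bounds(text, tolerance)
    return f"{field}:[{low} TO {high}]"


if __name__ == "__main__":
    print(approx_query("disk_bytes", "0.5 TB"))
```

The same `to_bytes` normalization would be applied at index time, so that the indexed byte counts and the generated range use the same unit.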


-- Jack Krupansky






Re: Unit of dimension for solr field

2013-11-11 Thread eakarsu
Ryan and Upayavira,

Is there an example skeleton for doing this in schema.xml and solrconfig.xml?
Or an example Java class that would help in building a
UnitResolvingFilterFactory class?

Thanks

Erol Akarsu





Re: Unit of dimension for solr field

2013-11-11 Thread eakarsu
Can DelimitedPayloadTokenFilterFactory be used to store unit dimension
information? This factory class can store extra information for a field.





Re: Unit of dimension for solr field

2013-11-11 Thread Erick Erickson
You seem to be consistently missing the problem that your queries will not
work as expected. How would you do a range query without writing some
kind of custom code that looked at the payloads to determine the normalized
units?

The simplest way to do this is probably to have your ingestion side normalize.
Put the original (complete with units) in a field that has stored=true and
indexed=false; it will only be used for showing in the results list.

_Also_ add the normalized value to another field that you set
indexed=true and stored=false on. That will allow range searches,
faceting, etc.

HTH,
Erick





Unit of dimension for solr field

2013-11-10 Thread eakarsu
I would like to have a Solr field whose values can use multiple units of
dimension. Suppose we store the memory size of a computer in a Solr field.
It can have the value 256 MB, 512 MB, or 1 GB, where we use MB and GB units.
The same goes for hard drive sizes: 256 MB, 50 GB or 3 TB, where we use MB,
GB and TB units.

How can I store these units together with the values themselves? I would
like to run range queries on such fields: say, bring me desktops that have
256M-1G of memory.

I appreciate any guidance

Thanks

Erol Akarsu





Re: Unit of dimension for solr field

2013-11-10 Thread Upayavira
It really depends upon how clever you want to be.

If I were to do it, I would push two versions into Solr: one with MB or
GB in it, for display, and another, resolved to a number, for faceting and
querying, i.e. do the work outside Solr.

If you did want to be clever, you could use a KeywordTokenizer in an
analysis chain (it spits out just a single token) and then write your
own UnitResolvingFilterFactory, which you could configure with mappings
such as KB-1024, and which spits out integer or float values.

This should then work for querying, as query string terms would be
analysed. It would be neat because the stored field values would include
the pretty units, whilst the indexed values would be pure numbers. You
could use the field for range faceting, but you would get the indexed
value, i.e. without the units, as faceting uses the indexed value, not
the stored one.
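
The resolving step such a filter would perform can be sketched as below. This only illustrates the mapping-driven logic; it is not a Lucene TokenFilter and uses none of the Solr plugin API. The `UnitResolver` class and the comma-separated `UNIT-FACTOR` mapping syntax are assumptions based on the "KB-1024" example above.

```python
import re


def parse_mappings(spec):
    """Parse 'KB-1024,MB-1048576' into {'KB': 1024, 'MB': 1048576}."""
    mappings = {}
    for entry in spec.split(","):
        unit, _, factor = entry.strip().partition("-")
        mappings[unit.upper()] = int(factor)
    return mappings


class UnitResolver:
    """Resolve a single token such as '256 MB' to a plain number."""

    def __init__(self, spec):
        self.factors = parse_mappings(spec)
        # Try longer unit names first so overlapping names resolve correctly.
        units = "|".join(
            re.escape(u) for u in sorted(self.factors, key=len, reverse=True)
        )
        self.pattern = re.compile(
            rf"^\s*(\d+(?:\.\d+)?)\s*({units})\s*$", re.IGNORECASE
        )

    def resolve(self, token):
        match = self.pattern.match(token)
        if match is None:
            raise ValueError(f"no unit mapping applies to {token!r}")
        amount, unit = match.groups()
        return float(amount) * self.factors[unit.upper()]


if __name__ == "__main__":
    resolver = UnitResolver("KB-1024,MB-1048576,GB-1073741824")
    print(resolver.resolve("256 MB"))
```

Because the whole field value arrives as one token (as with a KeywordTokenizer), the resolver only has to handle a single "number plus unit" string at a time.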

Upayavira
