Re: translating a character code to an ordinal?

2013-06-10 Thread Erick Erickson
You can use copyField. All it does is send the raw data to
the second field; the fact that they're different types is
irrelevant.

Why not just give it a try?

Erick



Re: translating a character code to an ordinal?

2013-06-10 Thread geeky2
i will try it.

i guess i made a poor assumption that you would not get predictable
results when copying a code like "mycode" to an int field where the
desired end result in the int field is, say, 1.

i was worried that some sort of ascii conversion or wrap around would
happen in the int field.

thx for the insight.

mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/translating-a-character-code-to-an-ordinal-tp4068966p4069335.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: translating a character code to an ordinal?

2013-06-10 Thread Erick Erickson
Hmmm, that may be a wrinkle. I'm actually not sure
what'll happen if the _raw_ thing you copy to the
int field is not an int (or whatever). You spoke of
character code translation, so it may blow up. In which
case I'd consider a custom update processor that read
the source field, performed whatever mods you want
to it and added the dest field.

You _might_ get away with the dest field doing the
translation with a PatternReplaceCharFilterFactory,
which processes the input stream before it gets
analyzed, as well.
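
A rough sketch of what that regex-style translation amounts to, written as plain JavaScript rather than as Solr config. The codes (ABC, DEF, GHI), their ordinals, and the names rules and translate are all made up for illustration; this is not PatternReplaceCharFilterFactory itself, just the same idea of applying pattern/replacement pairs to the raw value.

```javascript
// Hypothetical code-to-ordinal rules; codes and ordinals are invented for illustration.
var rules = [
  { pattern: /^ABC$/, replacement: "1" },
  { pattern: /^DEF$/, replacement: "2" },
  { pattern: /^GHI$/, replacement: "3" }
];

// Apply each rule in turn to the raw value, one pattern/replacement pair at a time.
function translate(code) {
  var out = code;
  for (var i = 0; i < rules.length; i++) {
    out = out.replace(rules[i].pattern, rules[i].replacement);
  }
  return out;
}

console.log(translate("ABC")); // known code, replaced by its ordinal string
console.log(translate("XYZ")); // unknown code, returned unchanged
```

An unmatched code comes back unchanged, so it would still reach the destination field as a non-integer, which is the case that may blow up.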

FWIW,
Erick



Re: translating a character code to an ordinal?

2013-06-10 Thread geeky2
i will try it out and let you know - 





--
View this message in context: 
http://lucene.472066.n3.nabble.com/translating-a-character-code-to-an-ordinal-tp4068966p4069339.html
Sent from the Solr - User mailing list archive at Nabble.com.


translating a character code to an ordinal?

2013-06-07 Thread geeky2
hello all,

environment: solr 3.5, centos

problem statement:  i have several character codes that i want to translate
to ordinal (integer) values (for sorting), while retaining the original code
field in the document.

i was thinking that i could use a copyField from my code field to my ord
field - then employ a pattern replace filter factory during indexing.

but won't the copyfield fail because the two field types are different?

ps: i also read the wiki about
http://wiki.apache.org/solr/DataImportHandler#Transformer the script
transformer and regex transformer - but was hoping to avoid this - if i
could.




thx
mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/translating-a-character-code-to-an-ordinal-tp4068966.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: translating a character code to an ordinal?

2013-06-07 Thread Jack Krupansky
This won't help you unless you move to Solr 4.0, but here's an update 
processor script from the book that can take the first character of a string 
field and add it as an integer value for another field:


 <updateRequestProcessorChain name="script-add-char-code">
   <processor class="solr.StatelessScriptUpdateProcessorFactory">
     <str name="script">add-char-code.js</str>
     <lst name="params">
       <str name="fieldName">content</str>
       <str name="codeFieldName">content_code_i</str>
     </lst>
   </processor>
   <processor class="solr.LogUpdateProcessorFactory" />
   <processor class="solr.RunUpdateProcessorFactory" />
 </updateRequestProcessorChain>

Here is the JavaScript script that should be placed in the
add-char-code.js file in the conf directory for the Solr collection:

 function processAdd(cmd) {
   var fieldName;
   var codeFieldName;
   // params is only defined when the chain config supplies a params list
   if (typeof params !== "undefined") {
     fieldName = params.get("fieldName");
     codeFieldName = params.get("codeFieldName");
   }
   if (fieldName == null)
     fieldName = "content";
   if (codeFieldName == null)
     codeFieldName = "content_code_i";

   // Get value for named field, no-op if empty
   var value = cmd.getSolrInputDocument().getField(fieldName);
   if (value != null){
     var str = value.getFirstValue();

     // No-op if string is empty
     if (str != null && str.length() != 0){
       // Get code for first character
       var code = str.charCodeAt(0);
       logger.info("String: \"" + str + "\" len: " + str.length() +
                   " code: " + code);

       // Set the character code output field value
       cmd.getSolrInputDocument().addField(codeFieldName, code);
     }
   }
 }

 function processDelete() {
   // Dummy - add if needed
 }

 function processCommit() {
   // Dummy - add if needed
 }

 function processRollback() {
   // Dummy - add if needed
 }

 function processMergeIndexes() {
   // Dummy - add if needed
 }

 function finish() {
   // Dummy - add if needed
 }
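
If the goal is an ordinal for the whole code (e.g. ABC) rather than the character code of its first letter, the same processAdd shape can consult a lookup table instead. A minimal sketch, assuming hypothetical field names code and code_ord_i and an invented table of codes; it would still need the Solr 4.0 script update processor configured above.

```javascript
// Hypothetical lookup table: these codes and ordinals are invented for illustration.
var ORDINALS = { "ABC": 1, "DEF": 2, "GHI": 3 };

// Pure helper: returns the ordinal for a code, or null if the code is unknown.
function codeToOrdinal(code) {
  var key = String(code);
  return Object.prototype.hasOwnProperty.call(ORDINALS, key) ? ORDINALS[key] : null;
}

// Same shape as the processAdd above, but mapping the whole code
// instead of its first character. "code" and "code_ord_i" are assumed field names.
function processAdd(cmd) {
  var doc = cmd.getSolrInputDocument();
  var field = doc.getField("code");
  if (field == null) {
    return; // no code field on this document: nothing to do
  }
  var ordinal = codeToOrdinal(field.getFirstValue());
  if (ordinal != null) {
    doc.addField("code_ord_i", ordinal);
  }
}
```

Unknown codes are left without an ordinal rather than given a default, so they are easy to spot after indexing.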

Test it:

 curl "http://localhost:8983/solr/update?commit=true&update.chain=script-add-char-code" \
 -H 'Content-type:application/json' -d '
 [{"id": "doc-1", "content": "abc"},
  {"id": "doc-2", "content": "1"},
  {"id": "doc-3", "content": ""},
  {"id": "doc-4"},
  {"id": "doc-5", "content": "\u0002 abc"},
  {"id": "doc-6", "content": ["And", "this", "is the end", "of this test."]}]'


Results:

 "id":"doc-1",
 "content":["abc"],
 "content_code_i":97,

 "id":"doc-2",
 "content":["1"],
 "content_code_i":49,

 "id":"doc-3",
 "content":[""],

 "id":"doc-4",

 "id":"doc-5",
 "content":["\u0002 abc"],
 "content_code_i":2,

 "id":"doc-6",
 "content":["And", "this",
   "is the end",
   "of this test."],
 "content_code_i":65,
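
The character codes in those results are just what JavaScript's charCodeAt(0) returns for the first character of each first value, which can be checked outside Solr with a few lines (inputs copied from the test documents above):

```javascript
// First values of the content field for doc-1, doc-2, doc-5 and doc-6.
// doc-3 (empty string) and doc-4 (no field) get no code at all.
var firstValues = ["abc", "1", "\u0002 abc", "And"];

// Code of the first character of each value.
var codes = firstValues.map(function (s) {
  return s.charCodeAt(0);
});

console.log(codes.join(", ")); // 97, 49, 2, 65
```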

-- Jack Krupansky




Re: translating a character code to an ordinal?

2013-06-07 Thread geeky2
hello jack,

thank you for the code ;)

what book are you referring to?  AFAICT - all of the 4.0 books are future
order.

we won't be moving to 4.0 (soon enough).

so i take it - copyfield will not work, eg - i cannot take a code like ABC
and copy it to an int field and then use the regex to turn it into an
ordinal?

thx
mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/translating-a-character-code-to-an-ordinal-tp4068966p4068984.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: translating a character code to an ordinal?

2013-06-07 Thread Jack Krupansky
Correct, you need either an update request processor, a custom field type, 
or to preprocess your input before you give it to Solr.


You can't do analysis on a non-text field.
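
For the preprocessing option, one possible sketch: compute the ordinal on the client and send it as an ordinary int field alongside the original code. The ranked list, the field names code and code_ord_i, and the function addOrdinals are assumptions for illustration, not anything from Solr.

```javascript
// Hypothetical ranking: a code's position in this list (1-based) is its ordinal.
var RANKED_CODES = ["ABC", "DEF", "GHI"];

// Return copies of the documents with an int ordinal added where the code is known.
// "code" and "code_ord_i" are assumed field names.
function addOrdinals(docs) {
  return docs.map(function (doc) {
    var copy = {};
    Object.keys(doc).forEach(function (key) {
      copy[key] = doc[key];
    });
    var index = RANKED_CODES.indexOf(doc.code);
    if (index !== -1) {
      copy.code_ord_i = index + 1;
    }
    return copy;
  });
}

var prepared = addOrdinals([
  { id: "doc-1", code: "GHI" },
  { id: "doc-2", code: "ABC" },
  { id: "doc-3", code: "ZZZ" } // unknown code: no ordinal added
]);

// JSON body ready to send to Solr with whatever client you already use.
var payload = JSON.stringify(prepared);
console.log(payload);
```

Since the ordinal arrives as a plain integer, the sort field needs no analysis at all, and the original code field is untouched.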

The book is my new Solr reference/guide that I will be self-publishing. We 
hope to make an Alpha draft available later next week.


-- Jack Krupansky



Re: translating a character code to an ordinal?

2013-06-07 Thread geeky2
thx,


please send me a link to the book so i can get/purchase it.


thx
mark





--
View this message in context: 
http://lucene.472066.n3.nabble.com/translating-a-character-code-to-an-ordinal-tp4068966p4068997.html
Sent from the Solr - User mailing list archive at Nabble.com.