Re: Data import

2013-09-10 Thread Luís Portela Afonso
OK, that makes sense. But when Solr runs the data import, it can tell that a 
document with the same uniqueKey as the one being indexed already exists, right?
Because when the source contains a document that is already indexed, Solr 
deletes the indexed one and creates a new one. Isn't it possible to discard the 
incoming document instead of deleting the existing one and creating a new one?

On Sep 10, 2013, at 2:16 AM, Alexandre Rafalovitch arafa...@gmail.com wrote:

 Sounds like you want a custom UpdateRequestProcessor chain that checks if
 the document already exists with given primary key and does not even bother
 passing it on to the next processor in the chain.
 
 This would make sense as an optimization or as a first step in a complex
 update chain that perhaps uses a lot of external resources to pre-process
 the content (e.g. named entities extraction).
 
 I don't think such a URP exists at the moment. But it should be simple to
 write one, assuming URPs can do lookups by primary ID and make go/no-go
 decisions on individual documents. Does anybody know the details of this?
 
 Regards,
   Alex.
 
 Personal website: http://www.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all at
 once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)
 
 
 On Tue, Sep 10, 2013 at 7:53 AM, Luis Portela Afonso meligalet...@gmail.com
 wrote:
 
 But with atomic updates I need to send the information myself, right?
 
 I want Solr to index it automatically, and it is doing that. Can you look
 at the Solr example in the source?
 There is an example in the example-DIH folder.
 
 Imagine that you run the import URL every 15 minutes. If the
 same information is already indexed, Solr will update it, and by update I
 mean delete and index again.
 
 I just want Solr to simply discard the information if it is already
 indexed.
 
 On Tuesday, September 10, 2013, Chris Hostetter wrote:
 
 
 : With a cron job, I do an HTTP request using curl to the address
 : http://localhost:port/solr/core/dataimport/?command=full-import&clean=false
 :
 : When it runs, if the RSS source has a feed that is already indexed in
 : Solr, it updates the existing one.
 : So if the source has the same information as the destination, it updates
 : the information in the destination.
 :
 : I want to prevent that. Is that clear? I may try to provide some
 : examples.
 
 Yes, specific examples would be helpful -- it's not really clear what it
 is that you want to prevent.
 
 Please note the URL I mentioned before and use it as a guideline for
 how much detail we need to understand what it is you are asking...
 
 :  Can you please be more specific about what you would like to see happen,
 :  we can better understand what your actual goal is?  It's really not clear
 
 :  https://wiki.apache.org/solr/UsingMailingLists
 
 
 
 -Hoss
 
 
 
 --
 Sent from Gmail Mobile
 



smime.p7s
Description: S/MIME cryptographic signature


Re: Javascript StatelessScriptUpdateProcessor

2013-09-10 Thread Luís Portela Afonso
Solved
On Sep 10, 2013, at 4:55 PM, Luís Portela Afonso meligalet...@gmail.com wrote:

 Is it possible to execute queries from a JavaScript script in the 
 StatelessScriptUpdateProcessor?
 I'm processing data with a JavaScript script and I want to run a query 
 against the data indexed in Solr.
 
 I know that the script has access to an instance of SolrQueryRequest and 
 SolrQueryResponse, but neither can be used. At least I'm not able to use 
 them.





Javascript StatelessScriptUpdateProcessor

2013-09-10 Thread Luís Portela Afonso
Is it possible to execute queries from a JavaScript script in the 
StatelessScriptUpdateProcessor?
I'm processing data with a JavaScript script and I want to run a query against 
the data indexed in Solr.

I know that the script has access to an instance of SolrQueryRequest and 
SolrQueryResponse, but neither can be used. At least I'm not able to use 
them.



Re: Data import

2013-09-09 Thread Luís Portela Afonso
When I run dataimport/?command=full-import&clean=false, Solr adds new 
documents with the information. But if the same information already exists with 
the same uniqueKey, it replaces the existing document with a new one.
It does not update the document; it creates a new one. Is it possible to 
prevent that?

I'm indexing RSS feeds. I ran the RSS example that ships with the Solr examples, 
and it does that.

On Sep 9, 2013, at 4:10 AM, Alexandre Rafalovitch arafa...@gmail.com wrote:

 What do you specifically mean by the disable document update? Do you mean
 in-place update? Or do you mean you want to run the import but not actually
 populate Solr collection with processed documents?
 
 It might help to explain the business level goal you are trying to achieve.
 Or, specific error that you are perhaps seeing and trying to avoid.
 
 Regards,
   Alex.
 
 
 Personal website: http://www.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all at
 once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)
 
 


Data import

2013-09-08 Thread Luís Portela Afonso
Hi,

Is it possible to disable document updates when running the data import 
full-import command?

Thanks



Re: Solr documents update on index

2013-09-06 Thread Luís Portela Afonso
Hi,

But I'm indexing RSS feeds. I want Solr to index them without changing the 
existing information of a document with the same uniqueKey.
The best approach would be for Solr to update the doc only if changes are 
detected, but I can live without that.

I would really like Solr not to update the document if it already exists.

I'm using the DataImportScheduler to launch the scheduled indexing.

I'd appreciate any help.

On Sep 6, 2013, at 9:16 AM, Shalin Shekhar Mangar shalinman...@gmail.com 
wrote:

 Yes, if a document with the same key exists, then the old document
 will be deleted and replaced with the new document. You can also
 partially update documents (we call it atomic updates) which reads the
 old document from local index, updates it according to the request and
 then replaces the old document with the new one.
 
 See 
 https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-UpdatingOnlyPartofaDocument
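
The read-modify-replace cycle described above can be sketched in JavaScript as follows. This is a simplified model that assumes a plain Map in place of the index and supports only the "set", "add" and "inc" modifiers; the function name atomicUpdate and the sample field values are hypothetical and not Solr API.

```javascript
// Simplified model of an atomic update: read the old document, apply the
// requested changes, then replace the old document with the merged one.
// Hypothetical helper; a real atomic update is sent to Solr's update handler.
function atomicUpdate(index, request) {
  const old = index.get(request.id);
  const merged = Object.assign({}, old); // start from a copy of the old doc
  merged.id = request.id;
  for (const [field, change] of Object.entries(request)) {
    if (field === "id") continue;
    if (change === null || typeof change !== "object" || Array.isArray(change)) {
      merged[field] = change; // plain value: simply overwrite the field
    } else if ("set" in change) {
      merged[field] = change.set;
    } else if ("add" in change) {
      const current = merged[field] === undefined ? [] : [].concat(merged[field]);
      merged[field] = current.concat(change.add);
    } else if ("inc" in change) {
      merged[field] = (merged[field] || 0) + change.inc;
    }
  }
  index.set(request.id, merged); // the old document is replaced as a whole
  return merged;
}

const feedIndex = new Map();
feedIndex.set("guid-1", { id: "guid-1", uuid: "u-123", title: "old title" });
const updated = atomicUpdate(feedIndex, { id: "guid-1", title: { set: "new title" } });
console.log(updated.uuid, updated.title);
```

Fields that the request does not mention, such as uuid in this example, are carried over from the old document, which is the difference from a plain add with the same key.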
 
 On Fri, Sep 6, 2013 at 1:03 AM, Luis Portela Afonso
 meligalet...@gmail.com wrote:
 Hi,
 
 I'm having a problem when Solr indexes.
 It is updating documents that are already indexed. Is this normal behavior?
 If a document with the same key already exists, is it supposed to be updated?
 I was thinking that it is supposed to update only if the information in the
 RSS feed has changed.
 
 I appreciate your help
 
 --
 Sent from Gmail Mobile
 
 
 
 -- 
 Regards,
 Shalin Shekhar Mangar.





Re: SOLR Prevent solr of modifying fields when update doc

2013-08-23 Thread Luís Portela Afonso
Hi, thanks for the answer, but the uniqueId is generated by me. When Solr 
indexes and there is an update to a doc, it deletes the doc and creates a new 
one, so it generates a new UUID.
That is not suitable for me, because I want Solr to update only some fields: 
the UUID is the key that I use to map the doc to a user in my database.

Right now I'm using information that comes from the source and never changes as 
my uniqueId, for example the guid that exists in some RSS feeds, or the link if 
the guid doesn't exist.

I think there isn't any simple solution for me, because from what I have read, 
when a doc is updated, Solr deletes the old one and creates a new one, right?

On Aug 23, 2013, at 12:07 PM, Erick Erickson erickerick...@gmail.com wrote:

 Well, not much in the way of help because you can't do what you
 want AFAIK. I don't think UUID is suitable for your use-case. Why not
 use your uniqueId?
 
 Or generate something yourself...
 
 Best
 Erick
 
 


SOLR Prevent solr of modifying fields when update doc

2013-08-22 Thread Luís Portela Afonso
Hi,

How can I prevent Solr from updating some fields when updating a doc?
The problem is, I have a UUID in a field named uuid, but it is not the 
unique key. When an RSS source updates a feed, Solr will update the doc with the 
same link but generate a new uuid. This is not desired, because I use this id 
to relate feeds to a user.

Can someone help me?

Many Thanks



SOLR Copy field if no value on destination

2013-08-07 Thread Luís Portela Afonso
Hi,

Is it possible to copy the value of one field to another only if the 
destination doesn't have a value?
An example:
- Indexing an RSS feed
- The feed has the fields link and guid, but sometimes guid is not present in 
  the feed
- I have a field named finalLink that I will copy values into

Now I want to copy guid to finalLink, but if guid has no value I want to copy 
link. 

My question is: is that possible with just the schema, processors, 
solrconfig.xml, and the data-config?

Thanks a lot



Re: SOLR Copy field if no value on destination

2013-08-07 Thread Luís Portela Afonso
Oh yeah, I had seen that processor in the book but couldn't remember it. 
Thanks a lot.
And thanks a lot for your solution. It works :)

On Aug 8, 2013, at 1:52 AM, Jack Krupansky j...@basetechnology.com wrote:

 Here's the actual update processor I used (and tested):
 
 <updateRequestProcessorChain name="first-default-field">
   <processor class="solr.CloneFieldUpdateProcessorFactory">
     <str name="source">main_s</str>
     <str name="dest">final_s</str>
   </processor>
   <processor class="solr.CloneFieldUpdateProcessorFactory">
     <str name="source">backup_s</str>
     <str name="dest">final_s</str>
   </processor>
   <processor class="solr.FirstFieldValueUpdateProcessorFactory">
     <str name="fieldName">final_s</str>
   </processor>
   <processor class="solr.LogUpdateProcessorFactory" />
   <processor class="solr.RunUpdateProcessorFactory" />
 </updateRequestProcessorChain>
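
The logic of the chain above, applied to the guid/link/finalLink fields from the original question, can be modelled with a small JavaScript sketch. It assumes documents are plain objects; cloneField, keepFirstValue and finalLink are hypothetical helper names that only imitate what the clone and first-value processors do, they are not the Solr classes themselves.

```javascript
// Imitates a clone step: append the values of `source` to `dest`.
function cloneField(doc, source, dest) {
  const values = doc[source];
  if (values === undefined) return doc; // nothing to copy
  const existing = doc[dest] === undefined ? [] : [].concat(doc[dest]);
  doc[dest] = existing.concat(values);
  return doc;
}

// Imitates a first-value step: keep only the first value of `field`.
function keepFirstValue(doc, field) {
  if (Array.isArray(doc[field]) && doc[field].length > 0) {
    doc[field] = doc[field][0];
  }
  return doc;
}

// guid wins when present, otherwise link is used.
function finalLink(item) {
  const doc = Object.assign({}, item);
  cloneField(doc, "guid", "finalLink");
  cloneField(doc, "link", "finalLink");
  return keepFirstValue(doc, "finalLink").finalLink;
}

console.log(finalLink({ guid: "g1", link: "http://example.com/a" }));
console.log(finalLink({ link: "http://example.com/a" }));
```

The order of the two clone steps is what encodes the priority: whichever source is cloned first supplies the value that survives the first-value step.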
 
 -- Jack Krupansky
 
 -Original Message- From: Jack Krupansky
 Sent: Wednesday, August 07, 2013 8:20 PM
 To: solr-user@lucene.apache.org
 Subject: Re: SOLR Copy field if no value on destination
 
 Sorry, I am unable to untangle the logic you are expressing, but I can 
 assure you that JavaScript and the StatelessScriptUpdate processor have full 
 support for implementing spaghetti code logic as tangled as desired!
 
 Simpler forms of logic can be implemented directly using non-script update 
 processor sequences, but once you start adding conditionals, there is a 50% 
 chance that you will need a script.
 
 There is a Default Value update processor, but it takes a literal value.
 
 Hmmm... maybe I’ll come up with a “default-value” script that takes a field 
 name for the default value. IOW, it would copy a specified field to the 
 destination IFF the destination had no value.
 
 Ahhh... wait... maybe... you could do this with the First Value Update 
 processor:
 
 1. Copy guid to FinalLink. (Clone Update processor).
 2. Copy link to FinalLink. (Clone Update processor).
 3. First Value Update processor.
 
 So, step 3 would leave link if guid was not there, or keep guid if it is 
 there and discard link.
 
 Yes, that should do it.
 
 This is worth an example in the book! Thanks for the inspiration!
 
 -- Jack Krupansky
 


SOLR FieldCopyProcessorFactory

2013-08-05 Thread Luís Portela Afonso
Hi,

Does something like FieldCopyProcessorFactory exist? I know there is a 
CloneFieldUpdateProcessor, but I'm interested in doing an append. Is that possible?

Many Thanks



Field append

2013-08-05 Thread Luís Portela Afonso
Hi there,

Is it possible to append two fields in Solr? I would like to append two 
fields with a custom delimiter. Is that possible?
I saw something like a CloneFieldUpdateProcessor, but when I try to use it, Solr 
says that it cannot find the class. I saw it on the following page: 
https://issues.apache.org/jira/browse/SOLR-2599

In the comments I saw:
<processor class="solr.FieldCopyProcessorFactory">
  <str name="source">category</str>
  <str name="dest">category_s</str>
</processor>

But I'm not able to use that either. Once again Solr says that it cannot find 
the class.

I hope you can help in any way. Thanks





Re: Solr PolyField

2013-08-01 Thread Luís Portela Afonso
Hi,

I have tried the solr.CloneFieldUpdateProcessorFactory suggested here, but 
the fields are not copied.

My dataconfig.xml

<field column="enclosure_type" xpath="/rss/channel/item/enclosure/@type" />

My schema.xml

<dynamicField name="enclosure_*" type="string" indexed="false" stored="true" 
              multiValued="true" />
<!-- </field> -->
<!-- <dynamicField name="enclosure_*" type="string" indexed="false" 
     stored="true" multiValued="false" /> -->

<field name="enclosure" type="text" indexed="true" stored="true" 
       multiValued="true" />

My solrconfig.xml

<updateRequestProcessorChain name="multiple-clones">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">enclosure_title</str>
    <str name="dest">enclosure</str>
  </processor>
</updateRequestProcessorChain>

and

<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">rss-data-config.xml</str>
    <str name="update.chain">multiple-clones</str>
    <str name="update.chain">fixIndexedValues</str>
  </lst>
</requestHandler>

Can you help? Thanks ;)

On Jul 31, 2013, at 6:03 PM, Luís Portela Afonso meligalet...@gmail.com wrote:

 Ok, thanks. I will check it.
 
 On Jul 31, 2013, at 5:08 PM, Jack Krupansky j...@basetechnology.com wrote:
 
 See:
 https://builds.apache.org/job/Solr-Artifacts-4.x/javadoc/solr-core/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html
 
 I have more examples in my book.
 
 -- Jack Krupansky
 
 From: Luís Portela Afonso 
 Sent: Wednesday, July 31, 2013 11:41 AM
 To: solr-user@lucene.apache.org 
 Subject: Re: Solr PolyField
 
 Hum, ok. 
 
 Is it possible to add static text to a field? Text that I write in the 
 configuration and then append another field to? I saw something like 
 CloneFieldProcessor, but when I start Solr, it says that it could not find 
 the class.
 I was trying to use processors to move one field to another.
 
 I saw this:
 <processor class="solr.FieldCopyProcessorFactory">
   <str name="source">lastname firstname</str>
   <str name="dest">fullname</str>
   <bool name="append">true</bool>
   <str name="append.delim">, </str>
 </processor>
 But when I try to use it, Solr says that it cannot find 
 solr.FieldCopyProcessorFactory. I'm using Solr 4.4.0.
 
 Thanks ;)
 
 On Jul 31, 2013, at 4:16 PM, Michael Della Bitta 
 michael.della.bi...@appinions.com wrote:
 
 
 OK,
 
 Then I would suggest creating multiValued enclosure_type, etc. tags for
 searching, and then one string-typed field to store the JSON snippet you've
 been showing.
 

Re: Solr PolyField

2013-08-01 Thread Luís Portela Afonso
So I have merged the two chains into one, and it is still not copying. Hum…

My solrconfig.xml

<updateRequestProcessorChain name="fixIndexedValues">
  <!-- <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">uuid</str>
  </processor> -->
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">enclosure_title</str>
    <str name="dest">enclosure</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

I also tried with the UUIDUpdateProcessorFactory commented out and nothing 
happens. Weird.

On Aug 1, 2013, at 5:37 PM, Jack Krupansky j...@basetechnology.com wrote:

 Hmmm... not sure what happens if you have two update chains specified:
 
  <str name="update.chain">multiple-clones</str>
  <str name="update.chain">fixIndexedValues</str>
 
 You need to merge them into one.
 
 -- Jack Krupansky
 

Re: Solr PolyField

2013-08-01 Thread Luís Portela Afonso
Oh my god. Thanks for noticing. The field name is wrong. It should be 
enclosure_type. I'm so sorry.

On Aug 1, 2013, at 6:33 PM, Jack Krupansky j...@basetechnology.com wrote:

 Are you sure the “enclosure_title” field is populated?
 
 Have you updated the request handler?
 
 -- Jack Krupansky
 

Solr PolyField

2013-07-31 Thread Luís Portela Afonso
Hi, I'm trying to create a field with multiple fields inside, that is:

"origin": {
    "htmlUrl": "http://www.gazzetta.it/",
    "streamId": "feed/http://www.gazzetta.it/rss/Home.xml",
    "title": "Gazzetta.it"
},

I want to get something like this. Is that possible? I'm using Solr 4.4.0.

Thanks



Re: Solr PolyField

2013-07-31 Thread Luís Portela Afonso
Hi,

I'm trying to index information from RSS feeds.

So, in more detail:

The RSS feed has something like: 
<enclosure url="http://www.engadget.com/podcasts/Engadget_Podcast_353.mp3" 
length="32642192" type="audio/mpeg"/>

With my current configuration, this is working and I get a result like this:

"enclosure": [
    "audio/mpeg",
    "http://www.engadget.com/podcasts/EngadgetHD_Podcast_359.mp3",
    37521428
],

BUT, this is not the result that I'm trying to get. With that I can't reliably 
tell whether audio/mpeg is the type, the url, or the length.

I want to get something like:

"enclosure": {
    "type": "audio/mpeg",
    "url": "http://www.engadget.com/podcasts/EngadgetHD_Podcast_359.mp3",
    "length": 37521428
},


So, as I understand it, this should be 3 fields inside another field, no?


Many thanks for the answer and the help.


On Jul 31, 2013, at 3:34 PM, Erick Erickson erickerick...@gmail.com wrote:

 Nope. Solr fields are flat. Why do you want to do this? I'm
 asking because this might be an XY problem and there
 may be other possibilities.
 
 Best
 Erick
 


Re: Solr PolyField

2013-07-31 Thread Luís Portela Afonso
These fields can be multiValued.
In the RSS standard it is not correct to do that, but some sources do and I'd 
like to grab it all. Is there any way to make that possible?

Once again, Many thanks :)

On Jul 31, 2013, at 3:54 PM, Michael Della Bitta 
michael.della.bi...@appinions.com wrote:

 Luís,
 
 Is there a reason why splitting this up into enclosure_type, enclosure_url,
 and enclosure_length would not work?
 
 
 Michael Della Bitta
 
 Applications Developer
 
 o: +1 646 532 3062  | c: +1 917 477 7906
 
 appinions inc.
 
 “The Science of Influence Marketing”
 
 18 East 41st Street
 
 New York, NY 10017
 
 t: @appinions https://twitter.com/Appinions | g+: plus.google.com/appinions
 https://plus.google.com/u/0/b/112002776285509593336/112002776285509593336/posts
 w: appinions.com http://www.appinions.com/
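
The split into enclosure_type, enclosure_url and enclosure_length suggested above can be sketched in JavaScript as parallel multi-valued fields. This is only an illustration with plain objects; flattenEnclosures and rebuildEnclosures are hypothetical helpers, and the sketch assumes every enclosure carries all three attributes so that the positions in the three arrays stay aligned.

```javascript
// Flatten a list of enclosure objects into three parallel multi-valued fields.
function flattenEnclosures(enclosures) {
  const doc = { enclosure_type: [], enclosure_url: [], enclosure_length: [] };
  for (const e of enclosures) {
    doc.enclosure_type.push(e.type);
    doc.enclosure_url.push(e.url);
    doc.enclosure_length.push(e.length);
  }
  return doc;
}

// Rebuild the nested view on the client side by pairing values by position.
function rebuildEnclosures(doc) {
  return doc.enclosure_url.map((url, i) => ({
    type: doc.enclosure_type[i],
    url: url,
    length: doc.enclosure_length[i],
  }));
}

const flat = flattenEnclosures([
  { type: "audio/mpeg", url: "http://www.engadget.com/podcasts/Engadget_Podcast_353.mp3", length: 32642192 },
  { type: "audio/mpeg", url: "http://www.engadget.com/podcasts/EngadgetHD_Podcast_359.mp3", length: 37521428 },
]);
console.log(rebuildEnclosures(flat)[1].length);
```

Because the fields stay flat, the grouping into one enclosure per position is a convention the client has to maintain; an item with a missing attribute would shift the positions and break the pairing.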
 
 
 On Wed, Jul 31, 2013 at 10:43 AM, Luís Portela Afonso 
 meligalet...@gmail.com wrote:
 
 Hi,
 
 I'm trying to index information from RSS feeds.
 
 In more detail:
 
 The RSS feed has something like:
 <enclosure url="http://www.engadget.com/podcasts/Engadget_Podcast_353.mp3"
 length="32642192" type="audio/mpeg"/>
 
 *With my current configuration, this is working and I get a result like
 this:*
 
 "enclosure": [
   "audio/mpeg",
   "http://www.engadget.com/podcasts/EngadgetHD_Podcast_359.mp3",
   37521428
 ],
 
 *BUT,* this is not the result that I'm trying to reach. With that, I cannot
 reliably tell whether audio/mpeg is the *type*, the *url*, or the *length*.
 
 *I want to reach something like:*
 
 "enclosure": {
   "type": "audio/mpeg",
   "url": "http://www.engadget.com/podcasts/EngadgetHD_Podcast_359.mp3",
   "length": 37521428
 },
 
 So, as I understand it, this should be 3 fields inside another field, no?
 
 
 Many Thanks for the answer and the help.
 
 
 On Jul 31, 2013, at 3:34 PM, Erick Erickson erickerick...@gmail.com
 wrote:
 
 Nope. Solr fields are flat. Why do you want to do this? I'm
 asking because this might be an XY problem and there
 may be other possibilities.
 
 Best
 Erick
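
Since Solr fields are flat, one common workaround for a nested structure like the "origin" object is to flatten it into prefixed field names before indexing. The following is only a minimal sketch of that idea, not something proposed in the thread: the `flatten` helper and the underscore-joined names (`origin_htmlUrl`, etc.) are assumptions made for illustration, and the matching fields would still need to be declared in schema.xml.

```python
def flatten(record, parent_key="", sep="_"):
    """Flatten nested dicts into a single level of prefixed field names."""
    flat = {}
    for key, value in record.items():
        # Prefix the child key with its parent's name, e.g. origin_title.
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# The nested structure from the original question.
nested = {
    "origin": {
        "htmlUrl": "http://www.gazzetta.it/",
        "streamId": "feed/http://www.gazzetta.it/rss/Home.xml",
        "title": "Gazzetta.it",
    }
}
flat_doc = flatten(nested)
```

Each resulting key maps to an ordinary flat field, so the grouping survives in the field name rather than in a nested value.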
 
 On Wed, Jul 31, 2013 at 5:09 AM, Luís Portela Afonso
 meligalet...@gmail.com wrote:
 
 Hi, I'm trying to create a field with multiple fields inside, that is:
 
 "origin": {
   "htmlUrl": "http://www.gazzetta.it/",
   "streamId": "feed/http://www.gazzetta.it/rss/Home.xml",
   "title": "Gazzetta.it"
 },
 
 I would like to get something like this. Is that possible? I'm using Solr 4.4.0.
 
 Thanks
 
 
 





Re: Solr PolyField

2013-07-31 Thread Luís Portela Afonso
As a single record? Hum, no.

An RSS feed has /rss/channel/ and then many /rss/channel/item elements, right?
Each /rss/channel/item is a new document in Solr. I started with the Solr RSS 
example, but I changed it to have more fields and different fields, and to get 
the feed URL from a database.

So each /rss/channel/item is one document for indexing, but each 
/rss/channel/item can have more than one enclosure tag.

Many thanks
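
To make the document model above concrete, here is a rough sketch, outside of Solr, of how each /rss/channel/item becomes one flat document even when an item carries several enclosure tags. The sample feed, the `items_to_flat_docs` helper, and the example.com URLs are made up for illustration; the field names follow the enclosure_url / enclosure_length / enclosure_type naming used in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed: the first item carries two enclosure tags.
SAMPLE_RSS = """<rss version="2.0"><channel>
  <title>Example feed</title>
  <item>
    <title>Episode 1</title>
    <enclosure url="http://example.com/a.mp3" length="100" type="audio/mpeg"/>
    <enclosure url="http://example.com/a.ogg" length="90" type="audio/ogg"/>
  </item>
  <item>
    <title>Episode 2</title>
  </item>
</channel></rss>"""


def items_to_flat_docs(rss_text):
    """Return one flat dict per /rss/channel/item, with parallel lists."""
    root = ET.fromstring(rss_text)
    docs = []
    for item in root.findall("./channel/item"):
        enclosures = item.findall("enclosure")
        docs.append({
            "title": item.findtext("title", default=""),
            # Position i in each list describes the same enclosure tag.
            "enclosure_url": [e.get("url", "") for e in enclosures],
            "enclosure_length": [e.get("length", "") for e in enclosures],
            "enclosure_type": [e.get("type", "") for e in enclosures],
        })
    return docs


docs = items_to_flat_docs(SAMPLE_RSS)
```

Because the three lists are filled in the same order, the i-th url, length, and type belong together, which is what allows multiValued flat fields to stand in for a nested enclosure object.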

On Jul 31, 2013, at 4:05 PM, Michael Della Bitta 
michael.della.bi...@appinions.com wrote:

 So you're trying to index a RSS feed as a single record, but you want to be
 able to search for and retrieve individual entries from within the feed? Is
 that the issue?
 
 Michael Della Bitta
 
 
 





Re: Solr PolyField

2013-07-31 Thread Luís Portela Afonso
Hum, ok.

Is it possible to add static text to a field? That is, text that I write in the 
configuration, with another field appended to it? I saw something like 
CloneFieldProcessor, but when I start Solr it says that it could not find the 
class.
I was trying to use processors to move one field to another.

I saw this:
<processor class="solr.FieldCopyProcessorFactory">
  <str name="source">lastname firstname</str>
  <str name="dest">fullname</str>
  <bool name="append">true</bool>
  <str name="append.delim">, </str>
</processor>

But when I try to use it, Solr says that it cannot find 
solr.FieldCopyProcessorFactory. I'm using Solr 4.4.0.

Thanks ;)

On Jul 31, 2013, at 4:16 PM, Michael Della Bitta 
michael.della.bi...@appinions.com wrote:

 OK,
 
 Then I would suggest creating multiValued enclosure_type, etc. tags for
 searching, and then one string-typed field to store the JSON snippet you've
 been showing.
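
The suggestion above could be sketched roughly as follows, on the client side before the document is sent to Solr. This is only an illustration under assumptions: the `add_enclosure_json` helper and the `enclosure_json` field name are hypothetical, and the field would have to be declared as a stored string field in schema.xml.

```python
import json


def add_enclosure_json(doc):
    """Add a stored-only 'enclosure_json' string next to the flat fields."""
    # Pair up the parallel multiValued fields position by position.
    records = [
        {"type": t, "url": u, "length": n}
        for t, u, n in zip(doc["enclosure_type"],
                           doc["enclosure_url"],
                           doc["enclosure_length"])
    ]
    enriched = dict(doc)  # leave the caller's document untouched
    enriched["enclosure_json"] = json.dumps(records)
    return enriched


doc = {
    "enclosure_type": ["audio/mpeg"],
    "enclosure_url": ["http://www.engadget.com/podcasts/EngadgetHD_Podcast_359.mp3"],
    "enclosure_length": ["37521428"],
}
enriched = add_enclosure_json(doc)

# A client reading search results can decode the snippet again.
restored = json.loads(enriched["enclosure_json"])
```

The flat enclosure_* fields remain available for searching, while the JSON string is only stored and handed back as-is, so the client gets the type/url/length grouping without Solr needing nested fields.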
 
 Michael Della Bitta
 
 
 

Re: Solr PolyField

2013-07-31 Thread Luís Portela Afonso
Ok, thanks. I will check it.

On Jul 31, 2013, at 5:08 PM, Jack Krupansky j...@basetechnology.com wrote:

 See:
 https://builds.apache.org/job/Solr-Artifacts-4.x/javadoc/solr-core/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html
 
 I have more examples in my book.
 
 -- Jack Krupansky
 

Solr rss indexation doubt

2013-07-30 Thread Luís Portela Afonso
Hi,

I'm using Apache Solr to index RSS feeds.
I'm successfully getting data (the URL and whether the feed is active for 
indexing) from a database, and using that as the source of an entity to index 
the RSS data.

I'm trying to reach a particular result, but I can't get it. I will try to 
explain with an example.

The RSS feed has something like: 
<enclosure url="http://www.engadget.com/podcasts/Engadget_Podcast_353.mp3" 
length="32642192" type="audio/mpeg"/>

In my schema.xml:
<dynamicField name="enclosure_*" type="string" indexed="false" 
stored="false" multiValued="true" />

<field name="enclosure" type="text" indexed="true" stored="true" 
multiValued="true" />

<copyField source="enclosure_*" dest="enclosure" />

In my data-config.xml:

<dataSource name="sql-ds" 
    type="JdbcDataSource" 
    driver="com.mysql.jdbc.Driver" 
    url="jdbc:mysql://localhost/db" 
    user="user" 
    password="pass"
    readOnly="true"/>

<dataSource name="url-ds" type="URLDataSource" />

<document>

  <entity name="src"
      rootEntity="false"
      dataSource="sql-ds"
      query="SELECT Feeds.IDFeed, Feeds.FeedUrl FROM meshapp.Feeds where 
      Feeds.Active = 1">

    <!-- Field created by MeshApp Reader to identify the source -->
    <field column="IDFeed" name="source-id" />

    <entity name="xml"
        rootEntity="true" 
        dataSource="url-ds"
        url='${src.FeedUrl}'
        onError="skip"
        processor="XPathEntityProcessor"
        forEach="/rss/channel/item | /rss/channel"
        transformer="DateFormatTransformer">

      <!-- Lot of fields -->

      <field column="enclosure_url" 
          xpath="/rss/channel/item/enclosure/@url" />
      <field column="enclosure_length" 
          xpath="/rss/channel/item/enclosure/@length" />
      <field column="enclosure_type" 
          xpath="/rss/channel/item/enclosure/@type" />

    </entity>
  </entity>
</document>

This is working and I get a result like this:

"enclosure": [
  "audio/mpeg",
  "http://www.engadget.com/podcasts/EngadgetHD_Podcast_359.mp3",
  37521428
],


BUT this is not the result that I'm trying to reach. I want to reach something 
like:

"enclosure": {
  "type": "audio/mpeg",
  "url": "http://www.engadget.com/podcasts/EngadgetHD_Podcast_359.mp3",
  "length": 37521428
},


So, as I understand it, this should be 3 fields inside another field, no?

I have tried something like this (it doesn't work):

In my schema.xml (I think that this doesn't make sense):
<field name="enclosure" type="html" indexed="false" stored="true" 
    multiValued="true">
  <dynamicField name="enclosure_*" type="string" indexed="false" 
      stored="true" multiValued="true" />
</field>


In my data-config.xml:
<field name="enclosure" column="enclosure">
  <field column="enclosure_url" 
      xpath="/rss/channel/item/enclosure/@url" />
  <field column="enclosure_length" 
      xpath="/rss/channel/item/enclosure/@length" />
  <field column="enclosure_type" 
      xpath="/rss/channel/item/enclosure/@type" />
</field>


Can you please help me?

Many Thanks,
Luís Portela Afonso







