For future reference, it would be nice to have an LCF logging option (a debug level, or perhaps a separate logger class for the actual POST) to log the text of the HTTP POST requests sent to Solr. It could display at most 1,000 or 2,000 characters of the actual content, with a notation of how many additional bytes/chars were not dumped.
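As a minimal sketch of what such a truncated dump might look like (LCF itself is Java; this Python helper and its name are purely illustrative):

```python
def truncate_for_log(body: bytes, max_chars: int = 2000) -> str:
    """Return a loggable preview of an HTTP POST body.

    Shows at most max_chars characters of the payload, followed by a
    notation of how many additional bytes were not dumped.
    """
    text = body.decode("utf-8", errors="replace")
    if len(text) <= max_chars:
        return text
    shown = text[:max_chars]
    omitted = len(body) - len(shown.encode("utf-8"))
    return shown + f" ... [{omitted} more bytes not shown]"
```

The logger would call this on the request body before writing the line, so large binary documents never flood the log.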

-- Jack Krupansky

--------------------------------------------------
From: <[email protected]>
Sent: Tuesday, June 15, 2010 5:23 AM
To: <[email protected]>
Subject: RE: Other document data.


Hi,

Yes I get it. Thanks for the clarification.

I was doing something similar before and it used to run; now it doesn't, so I got confused.

Is there any way to check whether the metadata is actually sent to Solr? I am experiencing a problem there and I can't seem to figure out where it is going wrong.


Thanks & Regards,
Rohan G Patil
Cognizant  Programmer Analyst Trainee,Bangalore || Mob # +91 9535577001
[email protected]

-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Tuesday, June 15, 2010 2:13 PM
To: [email protected]
Subject: RE: Other document data.

LCF is an incremental crawler. The version query is used to determine whether data needs to be refetched and reindexed. If it returns the same thing each time the document is examined, the data query will not be run the second time. I therefore suggest one of the following:

(1) Supply no version query at all. That signals to the connector that there is no version information and that the data must be reindexed on every job run.

(2) Supply a version query that properly reflects changes to the data. For instance, if there's a timestamp in each record, you can use that by itself ONLY if every metadata change is also accompanied by a change in that timestamp. If not, you will need to glom the metadata into the version string along with the timestamp.

Is this understood?
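To illustrate option (2) concretely (using SQLite in-memory and made-up table/column names, purely as a sketch), a version query can concatenate the metadata column onto the timestamp, so that a metadata-only edit still produces a new version string and triggers a refetch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id TEXT, modified TEXT, author TEXT, body BLOB)")
conn.execute("INSERT INTO docs VALUES ('d1', '2010-06-15 09:00', 'rohan', x'00')")

# Hypothetical version query: glom the metadata column onto the timestamp,
# so a metadata-only edit changes the version string even when the
# timestamp does not.
VERSION_QUERY = "SELECT modified || '|' || author FROM docs WHERE id = ?"

v1 = conn.execute(VERSION_QUERY, ("d1",)).fetchone()[0]
conn.execute("UPDATE docs SET author = 'karl' WHERE id = 'd1'")  # metadata-only change
v2 = conn.execute(VERSION_QUERY, ("d1",)).fetchone()[0]
assert v1 != v2  # a differing version string is what makes the crawler refetch
```

If the version query had returned only `modified`, the two strings would have been identical and the data query would be skipped on the second run.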

If you want to FORCE a reindex, there is a link in the crawler UI for the output connection that allows you to force reindexing of all data associated with that connection.

If this still doesn't seem to describe what you are seeing, please clarify further.

Thanks,
Karl

________________________________________
From: ext [email protected] [[email protected]]
Sent: Tuesday, June 15, 2010 12:51 AM
To: [email protected]
Subject: RE: Other document data.

Hi,

When we specify the metadata content, it runs fine the first time, but the second time it doesn't run the data query at all. What could be the problem?

Thanks & Regards,
Rohan G Patil
Cognizant  Programmer Analyst Trainee,Bangalore || Mob # +91 9535577001
[email protected]


-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Sunday, June 13, 2010 7:04 AM
To: [email protected]
Subject: RE: Other document data.

No. The data query (the same one that returns the blob info) can now include additional columns. These columns will be sent to Solr as metadata fields.
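For illustration only (table and column names are invented, sketched here against an in-memory SQLite database), a data query that returns extra columns beyond the blob would look like this, with each additional column becoming a metadata field for Solr:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE docs (id TEXT, url TEXT, checkin_date TEXT, author TEXT, body BLOB)"
)
conn.execute(
    "INSERT INTO docs VALUES "
    "('d1', 'http://example.com/d1', '2010-06-10', 'rohan', x'68656c6c6f')"
)

# Hypothetical data query: the blob plus two additional columns. In the
# connector, every column beyond the content would be sent to Solr as a
# metadata field named after the column.
DATA_QUERY = "SELECT body, checkin_date, author FROM docs WHERE id = ?"

body, checkin_date, author = conn.execute(DATA_QUERY, ("d1",)).fetchone()
metadata = {"checkin_date": checkin_date, "author": author}
```

The blob stays the document content; only the extra columns travel as metadata.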

Karl

________________________________________
From: ext [email protected] [[email protected]]
Sent: Friday, June 11, 2010 2:28 AM
To: [email protected]
Subject: RE: Other document data.

Hi,

I see that the issue is resolved.

Now, is there a new query in which we can specify the metadata fields?

Thanks & Regards,
Rohan G Patil
Cognizant  Programmer Analyst Trainee,Bangalore || Mob # +91 9535577001
[email protected]


-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Thursday, June 10, 2010 4:12 PM
To: [email protected]
Subject: RE: Other document data.

It is not possible to properly glom other fields onto a BLOB unless you know that the blob's contents are always encoded text. So I suggest you create a JIRA enhancement request in the Lucene Connector Framework project describing this enhancement (adding metadata support to the JDBC connector).

The url is: http://issues.apache.org/jira

You may need to create an account if you don't already have one. Let me know if you have any difficulties.

Thanks,
Karl


-----Original Message-----
From: ext [email protected] [mailto:[email protected]]
Sent: Thursday, June 10, 2010 6:39 AM
To: [email protected]
Subject: RE: Other document data.


Hi,

Using solution 1 is not a bad idea, but the problem is that the content is stored as a BLOB in the database, and gluing other fields onto a BLOB is not possible (is it?).

Regarding 2: Yes, I guess I can make that modification; in any case it all depends on how we present it to the user.

Thanks & Regards,
Rohan G Patil
Cognizant  Programmer Analyst Trainee,Bangalore || Mob # +91 9535577001
[email protected]

-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Thursday, June 10, 2010 3:19 PM
To: [email protected]
Subject: RE: Other document data.

(1) The JDBC connector is currently relatively primitive and has no support for "document metadata" at this time. You can, of course, glom multiple fields together into the content field, but that's pretty crude.

(2) The LCF convention for uniquely identifying documents in the target index is to use the URL of the document. All documents indexed with LCF have such a URL, and it is likely to be both useful and unique. This URL is how LCF requests deletion of a document from the index, if necessary, and also how it overwrites a document. So it maps pretty precisely to literal.id for the basic Solr setup. Now, it may be that this is too tied to the example, and that the Solr connector should have a configuration setting allowing the name of the id field to be changed - that sounds like a reasonable modification that would not be too difficult to do. Is this something you are looking for?
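For context, literal.id is a parameter of Solr's ExtractingRequestHandler: literal.<field> parameters attach literal field values to extracted content, so a request built roughly like the following (the endpoint path and document URL here are just illustrative of the example Solr setup) carries the document URL as the unique key:

```python
from urllib.parse import urlencode

doc_url = "http://example.com/docs/report.pdf"  # the document's access URL

# Build the query string the connector would send alongside the document
# content; literal.id sets the id field literally, so the document URL
# becomes the unique key in the index.
params = urlencode({"literal.id": doc_url, "commit": "true"})
request_url = "http://localhost:8983/solr/update/extract?" + params
```

Deletions and overwrites then address the document by that same id value.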

Karl
________________________________________
From: ext [email protected] [[email protected]]
Sent: Thursday, June 10, 2010 4:52 AM
To: [email protected]
Subject: Other document data.

I am using a JDBC connection to search for documents in the database.

The issue is that some document data (check-in date etc.) is present in other columns. How do I send this data to Solr so that it gets indexed?

Also, why is the URL of the file taken as the ID in Solr?

Thanks & Regards,
Rohan G Patil
Cognizant  Programmer Analyst Trainee,Bangalore || Mob # +91 9535577001
[email protected]<mailto:[email protected]>

This e-mail and any files transmitted with it are for the sole use of
the intended recipient(s) and may contain confidential and privileged
information.
If you are not the intended recipient, please contact the sender by
reply e-mail and destroy all copies of the original message.
Any unauthorized review, use, disclosure, dissemination, forwarding,
printing or copying of this email or any action taken in reliance on this
e-mail is strictly prohibited and may be unlawful.



