I believe the end-user documentation covers this to some extent. In short, the JDBC handler is designed to pull all the necessary information for a document, including the content data, out of a single database table. It therefore presumes the content is stored as either CLOB data or BLOB data in one column of that table.
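
As a rough sketch, a data query along the following lines would fit the view quoted below. The table and column names are taken from that view; the $(...) placeholders are the substitution variables as I recall them from the connector's default queries, so treat them as an assumption and check them against what the Queries tab actually shows.

```sql
-- Data query sketch: one row per document, everything from one table.
-- $(IDCOLUMN), $(URLCOLUMN), $(DATACOLUMN) and $(IDLIST) are assumed to be
-- the connector's substitution variables; verify them in the Queries tab.
SELECT idfield   AS $(IDCOLUMN),    -- document identifier
       urlfield  AS $(URLCOLUMN),   -- becomes the "id" in the search engine
       datafield AS $(DATACOLUMN)   -- the single column holding the content
FROM documenttable
WHERE idfield IN $(IDLIST)
```

The point of the sketch is only that the content comes back through a single aliased column alongside the id and the url.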
The url field is necessary because that is what ManifoldCF uses for the "id" in the target search engine. It needs this to be able to remove or replace the document in the target on subsequent job runs. It might as well be a URL because it presumes that the search user will need some way to get to the content of the indexed document.

Hope that answers your question.

Karl

2011/7/29 Shinichiro Abe <[email protected]>:
> Hello.
>
> I used JDBC Repository Connection and created
> the following view table[1] on postgesql.
> I set the default setting at Queries tab in job lists.
> I run the job, then on the Solr, only urlfield was indexed as id field.
>
> 1)I also want to index datafield. What is needed to set?
> Can I use it like solr dataimporthandler?
> For example, can it index datafield1, datafield2, datafield…?
>
> 2)Why ingesting datafield need to know if url is valid in source code?
> I want to index datafield without urlfield.
>
> My usage may be wrong, I assumed that string data of datafield is indexed as
> contents.
> I want to know what kind of table Data-query assume.
>
> [1]view:documenttable
> | idfield      | versionfield | urlfield        | datafield    | modifydatefield
> | char varying | char varying | char varying    | char varying | bigint
> --------------------------------------------------------------------------------
> | 1            | 1            | file:///dummy/1 | test string  | 1
> | 2            | 1            | file:///dummy/2 | test info    | 1
>
> Thank you,
> Shinichiro Abe
