OK, great. Can I use this EntityProcessor in a config that also uses JdbcDataSource?
Like this:

<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/db_1" user="root" password=""
              autoCommit="true" />
  <document>
    <entity name="table_1_fetch"
            query="SELECT field_1 FROM table_1 WHERE ('${dataimporter.request.clean}' != 'false' OR added_on > '${dataimporter.last_index_time}')">
      <entity name="genesis_case_documents"
              query="SELECT original_document FROM case_documents WHERE case_md5 ='${genesis_case_info.case_md5}'">
      </entity>
      <entity processor="PlainTextEntityProcessor" name="table_2_from_file_fetch"
              url="http://localhost/project_1/files/a.txt"
              dataSource="data-source-name">
        <field column="plainText" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>

By the way, I currently load the field into "text_en_splitting" as defined in
schema.xml...


On Mon, Jul 8, 2013 at 7:59 PM, Alexandre Rafalovitch <arafa...@gmail.com> wrote:

> http://wiki.apache.org/solr/DataImportHandler#PlainTextEntityProcessor or
> http://wiki.apache.org/solr/DataImportHandler#LineEntityProcessor ?
>
> The file name gets exposed as a ${entityname.fieldname} variable. You can
> probably copy/manipulate it with a transformer on the external entity
> before it hits an inner one.
>
> Regards,
>    Alex.
>
> Personal website: http://www.outerthoughts.com/
> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> - Time is the quality of nature that keeps events from happening all at
> once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
>
>
> On Mon, Jul 8, 2013 at 10:42 AM, Raheel Hasan <raheelhasan....@gmail.com> wrote:
>
> > On this page (http://wiki.apache.org/solr/DataImportHandler), I can't see
> > how it's possible. Perhaps there is another guide...
> >
> > Basically, this is what I am doing:
> > Index data from multiple tables into Solr (see here
> > http://wiki.apache.org/solr/DIHQuickStart). I need to skip 1 very big,
> > heavy table, as it only has 1 field, which is a complete file. So I want
> > to skip the step of loading that file per record into my RDB and then
> > indexing it... Instead, I want to directly index that file with the rest
> > of the records coming from the database...
> >
> >
> > On Mon, Jul 8, 2013 at 7:30 PM, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
> >
> > > Did you have a chance to look at DIH with nested entities yet? That's
> > > probably the way to go to start out.
> > >
> > > Or a custom client, of course. Or, ETL solutions that support Solr
> > > (e.g. Apache Flume - not personally tested yet).
> > >
> > > Regards,
> > >    Alex.
> > >
> > > Personal website: http://www.outerthoughts.com/
> > > LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> > > - Time is the quality of nature that keeps events from happening all at
> > > once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
> > >
> > >
> > > On Mon, Jul 8, 2013 at 10:08 AM, Raheel Hasan <raheelhasan....@gmail.com> wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > I am looking for a way to import/index data such that I load data
> > > > from table_1 and, instead of joining with table_2, import the rest
> > > > of the "joined" data from a file. The name of the file comes from a
> > > > field in table_1...
> > > >
> > > > Is it possible? And is it easily possible?
> > > >
> > > > --
> > > > Regards,
> > > > Raheel Hasan
> > >
> >
> > --
> > Regards,
> > Raheel Hasan
>

--
Regards,
Raheel Hasan
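
For reference, the nested-entity idea discussed in this thread could be sketched roughly as below. This is an untested sketch, not a confirmed working config. It assumes that PlainTextEntityProcessor needs a data source that hands it a character stream (such as URLDataSource or FileDataSource) rather than JdbcDataSource, so a second data source is declared and each entity picks one by name. The data source names "db" and "files", the use of URLDataSource, and the idea that field_1 holds the file name are all assumptions made for illustration; the delta condition from the original query is left out to keep the sketch short.

```xml
<dataConfig>
  <!-- Assumption: two named data sources; "db" and "files" are illustrative names -->
  <dataSource name="db" type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/db_1" user="root" password="" />
  <dataSource name="files" type="URLDataSource" encoding="UTF-8" />

  <document>
    <!-- Outer entity: one row per record in table_1, read through JDBC -->
    <entity name="table_1_fetch" dataSource="db"
            query="SELECT field_1 FROM table_1">

      <!-- Inner entity: runs once per outer row and reads one file as plain text.
           Assumption: field_1 holds the file name, exposed as ${table_1_fetch.field_1} -->
      <entity name="table_2_from_file_fetch" processor="PlainTextEntityProcessor"
              dataSource="files"
              url="http://localhost/project_1/files/${table_1_fetch.field_1}">
        <!-- PlainTextEntityProcessor puts the whole file content in "plainText" -->
        <field column="plainText" name="text" />
      </entity>
    </entity>
  </document>
</dataConfig>
```

If the files sit on the same machine as Solr, FileDataSource with a local path in the url attribute may be a simpler choice than fetching them over HTTP.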