Re: How to import data from Oracle to Solr

Wolfgang Schreiber Wed, 18 Jul 2012 02:50:33 -0700

Hi Karl,
hi ManifoldCF team members,


Using Solr's copyField element we managed to create separate fields for the
different database columns:

<field name="city" type="cityType" indexed="true" stored="true" /> 
...
<copyField source="text" dest="city"/>
...
<fieldType name="cityType" class="solr.TextField">
        <analyzer>
                <tokenizer class="solr.PatternTokenizerFactory" 
                 pattern=".+city:(.+);.*" group="1" />
        </analyzer>
</fieldType>    
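
To illustrate what the pattern with group="1" picks out of the concatenated
value, here is a minimal sketch. It uses Python's re module as a stand-in for
the Java regex engine behind PatternTokenizerFactory (an assumption; for
patterns this simple the two should behave alike). The extract() helper and
the "[^;]+" variants are hypothetical and not part of our schema; the sample
value follows the 'ZIP:...;city:...;street:...' layout of our data query.

```python
import re

def extract(pattern, text, group=1):
    """Return the requested capture group of the first match, or None."""
    match = re.search(pattern, text)
    return match.group(group) if match else None

# Sample value in the layout produced by the concatenating data query.
doc = "ZIP:70173;city:Stuttgart;street:Heilbronner"

# The pattern from the cityType analyzer. It works here only because exactly
# one ';' follows the city value: the greedy (.+) backtracks to the last ';'.
city_greedy = extract(r".+city:(.+);.*", doc)

# The same greedy style applied to the first field captures too much.
zip_greedy = extract(r".*ZIP:(.+);.*", doc)

# Hypothetical alternative: stop at the next ';' instead of relying on
# backtracking. This also handles the last field, which has no trailing ';'.
zip_strict = extract(r"ZIP:([^;]+)", doc)
city_strict = extract(r"city:([^;]+)", doc)
street_strict = extract(r"street:([^;]+)", doc)

print(city_greedy, zip_greedy, zip_strict, city_strict, street_strict)
```

The greedy form depends on the position of the field within the string, so
a pattern of the "[^;]+" kind may be the safer choice if more columns are
added later. Either way, the extracted values are still plain text.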

However, this solution has some drawbacks; for example, all of the newly
created fields are text fields.
In particular, numeric and date columns are also copied to text fields, so we
cannot use Solr's type-specific functions on them.

So, coming back to the offer in your first mail: would it be possible for you
to create a JDBC connector enhancement to support metadata?
Is there a special request process we must follow?

Best regards
Wolfgang 




-----Original Message-----
From: Karl Wright [mailto:[email protected]]
Sent: Tue 17.07.2012 15:13
To: [email protected]
Subject: Re: How to import data from Oracle to Solr
 
"So if I understand correctly ...

1) ... all mappings added to the "Solr Field Mapping" tab are ignored in case
of a JDBC resource connector?"

Not exactly - the mappings aren't ignored, there just isn't any
metadata associated with a JDBC connector document, so the mappings
never apply.

Regardless, I am glad you got the rest worked out.

Karl


On Tue, Jul 17, 2012 at 9:09 AM, Wolfgang Schreiber
<[email protected]> wrote:
> Hello Karl,
>
> thank you very much for your quick answer!
>
> So if I understand correctly ...
>
> 1) ... all mappings added to the "Solr Field Mapping" tab are ignored in case
> of a JDBC resource connector?
>
> 2) Our data query must look something like this (given that || is Oracle's
> concatenation operator):
>    SELECT ID AS "$(IDCOLUMN)", ADDRESS_URL AS "$(URLCOLUMN)",
>    'ZIP:' || ZIP || ';city:' || CITY || ';street:' || STREET
>    AS "$(DATACOLUMN)" FROM ADDRESS WHERE ID IN $(IDLIST)
>
>    This would result in DATACOLUMN values like:
>    ZIP:70173;city:Stuttgart;street:Heilbronner
>
> We tried this statement and we got the data into the text field of our Solr
> index.
> It seems we are one step further!
>
> Thank you for your help! Best regards
> Wolfgang
>
>
> -----Original Message-----
> From: Karl Wright [mailto:[email protected]]
> Sent: Tue 17.07.2012 12:42
> To: [email protected]
> Subject: Re: How to import data from Oracle to Solr
>
> Hi Wolfgang,
>
> ManifoldCF is meant to handle a binary document and its metadata.  You
> must provide the document.  Metadata is optional.
>
> The JDBC connector does not currently support metadata.  In order to
> index this, therefore, you will need to decide what should go into
> your "binary document" from your database fields.  You can append
> together multiple fields into one document by means of SQL, e.g. the
> CONCAT operator or its Oracle equivalent.  This would go into one
> field in Solr, then, which is what you'd search on.
>
> Alternatively, if you really need separate indexed fields in Solr for
> search reasons, you can request a JDBC connector enhancement to add
> metadata support.  You'd still need a binary document, although you
> could return a blank value for that.
>
> So I guess the answer depends on what you are trying to do on the whole.
>
> Karl
>
>
> On Tue, Jul 17, 2012 at 6:27 AM, Wolfgang Schreiber
> <[email protected]> wrote:
>> Hello,
>>
>> we are trying to ingest data from an Oracle database into Solr.
>> We managed to insert docs into Solr, but only document IDs are inserted and
>> no other data fields.
>>
>> Can you provide an example of how to set up the import job in ManifoldCF?
>>
>>
>> Assume we have the following initial situation:
>>
>> 1) Our Oracle table looks something like:
>>
>> ADDRESS
>> --------------------------
>> ID                      NUMBER
>> ZIP                     NUMBER
>> CITY                    VARCHAR(2)
>> STREET                  VARCHAR(2)
>>
>>
>> 2) In Solr's schema.xml we added the following fields for the database
>> columns
>> ...
>>         <field name="ZIP" type="int" indexed="true" stored="true" />
>>         <field name="City" type="string" indexed="true" stored="true" />
>>         <field name="Street" type="string" indexed="true" stored="true" />
>> ...
>>
>>
>> So here are our questions:
>>
>> * How do we have to set up the queries for the ManifoldCF job?
>>   In particular, what exactly must the seeding query and the data query look
>> like?
>>
>> * What do the Solr field mappings look like?
>>
>>
>> We read your online documentation as well as your MEAP book but could not
>> find a working example for a successful import between Oracle and Solr.
>> Any help is welcome!
>>
>> Best regards
>> Wolfgang
>
