[ 
https://issues.apache.org/jira/browse/SOLR-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493870#comment-13493870
 ] 

zakaria benzidalmal commented on SOLR-2549:
-------------------------------------------

Data config example:

<dataConfig>
        <dataSource name="URL"
                    baseUrl="file:///c:/work/solr/example/example-DIH/solr/csv/in/"
                    type="URLDataSource" />
        <document name="FixedWidthCounts">
                
                <!-- for delimited files -->
                <entity
                        name="sites"
                        processor="org.apache.solr.handler.dataimport.LineEntityProcessor"
                        dataSource="URL"
                        url="data.csv"
                        header="true"
                        separator=","
                        ... <!-- you can specify other updatecsv request handler parameters here -->
                />

                <!-- for fixed-width files -->
                <entity
                        name="sites"
                        processor="org.apache.solr.handler.dataimport.LineEntityProcessor"
                        dataSource="URL"
                        url="data.csv"
                        colDef1="ID,0,6,STRING,0,LEFT"
                        colDef2="NAME,6,26,STRING,0,LEFT"
                        ...
                />


        </document>
</dataConfig>
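
To make the fixed-width entity above more concrete, here is a minimal Python sketch of how a colDef such as "ID,0,6,STRING,0,LEFT" could be applied to one line of input. It is not the patch's implementation. It assumes the first three comma-separated fields are the column name, a 0-based start offset, and an exclusive end offset (the sample values are equally consistent with other readings, so check the patch's API documentation); the remaining fields are carried along uninterpreted. The helper names are hypothetical.

```python
# Hypothetical helpers for illustration only; not part of Solr or SOLR-2549.

def parse_col_def(col_def):
    """Split a colDef string such as 'ID,0,6,STRING,0,LEFT'.

    ASSUMPTION: field 1 is the column name, fields 2 and 3 are the
    0-based start offset and the exclusive end offset. The remaining
    fields are returned verbatim and are not interpreted here.
    """
    parts = col_def.split(",")
    return parts[0], int(parts[1]), int(parts[2]), parts[3:]


def parse_fixed_width_line(line, col_defs):
    """Slice one input line into a {column name: value} dict."""
    row = {}
    for col_def in col_defs:
        name, start, end, _rest = parse_col_def(col_def)
        # Strip the padding that fills out the fixed-width column.
        row[name] = line[start:end].strip()
    return row


col_defs = ["ID,0,6,STRING,0,LEFT", "NAME,6,26,STRING,0,LEFT"]
line = "000042" + "Example Site".ljust(20)  # 6-char ID + 20-char NAME
row = parse_fixed_width_line(line, col_defs)
print(row)
```

Under these assumptions each colDef maps to one field of the resulting row, which is the role the entity's colDef1, colDef2, ... attributes play in the config.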
                
> DIH LineEntityProcessor support for delimited & fixed-width files
> -----------------------------------------------------------------
>
>                 Key: SOLR-2549
>                 URL: https://issues.apache.org/jira/browse/SOLR-2549
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>    Affects Versions: 4.0-ALPHA
>            Reporter: James Dyer
>            Priority: Minor
>         Attachments: SOLR-2549.patch, SOLR-2549.patch, SOLR-2549.patch, 
> SOLR-2549.patch
>
>
> Provides support for fixed-width and delimited files without needing to write
> a Transformer.
> The following XML properties are supported with this version of
> LineEntityProcessor:
> For fixed-width files:
>  - colDef[#]
> For delimited files:
>  - fieldDelimiterRegex
>  - firstLineHasFieldnames
>  - delimitedFieldNames
>  - delimitedFieldTypes
> These properties are described in the API documentation.  See the patch.
> When combined with the cache improvements from SOLR-2382 this allows you to 
> join a flat file entity with other entities (sql, etc).
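
As a rough illustration of the delimited-file properties listed in the issue description, the Python sketch below splits each line with a regular expression and takes field names either from the first line or from an explicit list. It only mirrors the property names (fieldDelimiterRegex, firstLineHasFieldnames, delimitedFieldNames); how the patch actually encodes and interprets them is an assumption here, and delimitedFieldTypes is not modelled. The function name is hypothetical.

```python
import re

# Hypothetical function for illustration only; not part of Solr or SOLR-2549.

def parse_delimited(lines, field_delimiter_regex,
                    first_line_has_fieldnames=True,
                    delimited_field_names=None):
    """Turn delimited text lines into a list of {field name: value} dicts.

    ASSUMPTION: field names come from the first line when
    first_line_has_fieldnames is true, otherwise from the
    delimited_field_names list supplied by the caller.
    """
    pattern = re.compile(field_delimiter_regex)
    rows = iter(lines)
    if first_line_has_fieldnames:
        names = pattern.split(next(rows).rstrip("\n"))
    else:
        names = list(delimited_field_names)
    # Pair each split value with its field name, one dict per line.
    return [dict(zip(names, pattern.split(line.rstrip("\n")))) for line in rows]


with_header = parse_delimited(["id,name", "1,alpha", "2,beta"], ",")
without_header = parse_delimited(
    ["1 | alpha", "2 | beta"],
    r"\s*\|\s*",
    first_line_has_fieldnames=False,
    delimited_field_names=["id", "name"],
)
print(with_header)
print(without_header)
```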
