[
https://issues.apache.org/jira/browse/SOLR-7057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Erik Hatcher reassigned SOLR-7057:
----------------------------------
Assignee: Erik Hatcher
> SimplePostTool curbside appeal
> ------------------------------
>
> Key: SOLR-7057
> URL: https://issues.apache.org/jira/browse/SOLR-7057
> Project: Solr
> Issue Type: Improvement
> Components: SimplePostTool
> Reporter: Timothy Potter
> Assignee: Erik Hatcher
> Priority: Minor
>
> While trying to index some Freebase articles, such as:
> http://maven.tamingtext.com/freebase-wex-2011-01-18-articles-first10k.tsv
> using the SimplePostTool (bin/post), I ran into a few minor issues whose
> fixes would help new users trying to get their content indexed.
> First, I tried the naive approach:
> {code}
> $ bin/post -c freebase ./freebase-wex-2011-01-18-articles-first10k.tsv
> {code}
> Didn't work ... here's the output:
> {code}
> SimplePostTool: WARNING: Skipping freebase-wex-2011-01-18-articles-first10k.tsv. Unsupported file type for auto mode.
> 1 files indexed.
> {code}
> Ummm ... no, that 1 file was not indexed ;-) Instead, the output should be
> something like:
> {code}
> SimplePostTool: WARNING: Skipping freebase-wex-2011-01-18-articles-first10k.tsv. Unsupported file type for auto mode.
> 0 of 1 files indexed.
> {code}
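As a rough illustration of the proposed summary line, the sketch below counts attempted and accepted files separately. It does not post anything to Solr; the function name `report` is hypothetical, and the extensions in the `case` pattern are a deliberately tiny illustrative subset, not the tool's real list.

```shell
# Hypothetical sketch: report "N of M files indexed." instead of counting
# skipped files as indexed. Nothing is actually sent to Solr here.
report() {
  attempted=0
  indexed=0
  for f in "$@"; do
    attempted=$((attempted + 1))
    case "$f" in
      # Illustrative subset of extensions treated as supported (assumption).
      *.xml|*.json|*.csv) indexed=$((indexed + 1)) ;;
      *) printf 'WARNING: Skipping %s. Unsupported file type for auto mode.\n' "$f" >&2 ;;
    esac
  done
  printf '%s of %s files indexed.\n' "$indexed" "$attempted"
}

report ./freebase-wex-2011-01-18-articles-first10k.tsv
```

The warning goes to stderr and the summary to stdout, so the two can be checked independently.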
> Besides the misleading output, shouldn't tsv be a supported file type for
> auto-mode? It's a common enough format ...
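One way to picture extension-aware handling is a small wrapper that picks extra request parameters from the file extension. This is only a sketch written outside the tool: `params_for` is a hypothetical name, and the only detail taken from this issue is that a tab separator is passed as `separator=%09`.

```shell
# Hypothetical helper: print the extra -params value a file would need,
# based only on its extension. This is not part of bin/post.
params_for() {
  case "$1" in
    *.tsv) printf '%s\n' 'separator=%09' ;;  # tab-separated: URL-encoded tab
    *)     printf '%s\n' '' ;;               # anything else: no extra params
  esac
}

params_for ./freebase-wex-2011-01-18-articles-first10k.tsv
```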
> So I renamed the file to .csv instead and re-ran ... this time I get:
> {code}
> $ mv freebase-wex-2011-01-18-articles-first10k.tsv freebase-wex-2011-01-18-articles-first10k.csv
> $ bin/post -c freebase ./freebase-wex-2011-01-18-articles-first10k.csv
> ERROR - 2015-01-28 16:24:16.074; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: CSVLoader: input=null, line=1,expected 108 values but got 4
> {code}
> Hmmm ... OK ... I did a little Googling and discovered that I needed to set
> the separator to %09, a URL-encoded tab (again, the tool should just
> recognize TSV as a supported format):
> {code}
> bin/post -c freebase -params "separator=%09&escape=\\" ./freebase-wex-2011-01-18-articles-first10k.csv
> {code}
> Success! (of course I had to add a header line to the file too, but there's
> little we can do about that)
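For the header step mentioned above, a minimal sketch of prepending a header line to a headerless tab-separated file follows. The helper name `add_header`, the column names `id` and `title`, and the sample row are all hypothetical; the real Freebase file has its own columns.

```shell
# Hypothetical helper: write a copy of a headerless TSV with a header line first.
#   $1 = header text (tab-separated), $2 = input file, $3 = output file
add_header() {
  { printf '%s\n' "$1"; cat "$2"; } > "$3"
}

# Demo on a throwaway file with made-up columns.
demo_dir=$(mktemp -d)
printf '1\tExample article\n' > "$demo_dir/in.tsv"
add_header "$(printf 'id\ttitle')" "$demo_dir/in.tsv" "$demo_dir/out.tsv"
cat "$demo_dir/out.tsv"
```

Writing to a separate output file keeps the original download untouched, which makes it easy to retry with a different header.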