One important thing I forgot to mention: you will have to increase the
Java heap size for larger data sets.

Set JAVA_OPT to a maximum heap size (-Xmx) large enough for your data.
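
As a rough sketch of what that could look like (the heap sizes are
placeholders, and launching through Jetty's start.jar is an assumption about
the setup, not something stated in this thread):

```shell
# Assumed values: pick sizes that fit your machine and data set.
# -Xms is the initial heap size, -Xmx the maximum heap size.
JAVA_OPT="-Xms512m -Xmx1024m"

# Print the launch command instead of running it, so it can be reviewed first.
# start.jar is assumed to be the Jetty launcher in the Solr example directory.
printf 'java %s -jar start.jar\n' "$JAVA_OPT"
```

If Solr runs inside another servlet container, the same -Xmx value would go
wherever that container reads its JVM options.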

Adam

On Thu, Dec 16, 2010 at 3:27 PM, Yonik Seeley <yo...@lucidimagination.com> wrote:

> On Thu, Dec 16, 2010 at 3:06 PM, Dennis Gearon <gear...@sbcglobal.net> wrote:
> > That easy, huh? Heck, this gets better and better.
> >
> > BTW, how about escaping?
>
> The CSV escaping?  It's configurable to allow for loading different
> CSV dialects.
>
> http://wiki.apache.org/solr/UpdateCSV
>
> By default it uses double-quote encapsulation, like Excel would.
> The bottom of the wiki page shows how to configure tab separators and
> backslash escaping, which is what MySQL produces by default.
>
> -Yonik
> http://www.lucidimagination.com
>
>
> >
> >  Dennis Gearon
> >
> >
> > Signature Warning
> > ----------------
> > It is always a good idea to learn from your own mistakes. It is usually a better
> > idea to learn from others’ mistakes, so you do not have to make them yourself.
> > from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'
> >
> >
> > EARTH has a Right To Life,
> > otherwise we all die.
> >
> >
> >
> > ----- Original Message ----
> > From: Adam Estrada <estrada.adam.gro...@gmail.com>
> > To: Dennis Gearon <gear...@sbcglobal.net>; solr-user@lucene.apache.org
> > Sent: Thu, December 16, 2010 10:58:47 AM
> > Subject: Re: bulk commits
> >
> > This is how I import a lot of data from a CSV file. There are close to 100k
> > records in there. Note that you can either pre-define the column names using
> > the fieldnames param like I did here *or* include header=true, which will
> > automatically pick up the column names from the header row if your file has one.
> >
> > curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,lat,lng,countrycode,population,elevation,gtopo30,timezone,modificationdate,cat&stream.file=C:\tmp\cities1000.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> >
> > This seems to load everything into some kind of temporary location before
> > it's actually committed. If something goes wrong, there is a rollback feature
> > that will undo anything that happened before the commit.
> >
> > As far as batching a bunch of files, I copied and pasted the following into
> > Cygwin and it worked just fine.
> >
> > curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,lat,lng,countrycode,population,elevation,gtopo30,timezone,modificationdate,cat&stream.file=C:\tmp\cities1000.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> > curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xab.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> > curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xac.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> > curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xad.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> > curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xae.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> > curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xaf.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> > curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xag.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> > curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xah.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> > curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xai.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> > curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xaj.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> > curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xak.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> > curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xal.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> > curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xam.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> > curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xan.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> > curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xao.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> > curl "http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate&stream.file=C:\tmp\xap.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8"
> > curl http://localhost:8983/solr/update -H "Content-Type: text/xml" --data-binary '<optimize/>'
> >
> > Adam
> >
> > On Thu, Dec 16, 2010 at 1:44 PM, Dennis Gearon <gear...@sbcglobal.net> wrote:
> >
> >> Might be CSV or tab-delimited text.
> >>
> >> Sent from Yahoo! Mail on Android
> >>
> >>  ------------------------------
> >> * From: * Adam Estrada <estrada.adam.gro...@gmail.com>;
> >> * To: * <solr-user@lucene.apache.org>;
> >> * Subject: * Re: bulk commits
> >> * Sent: * Thu, Dec 16, 2010 6:35:17 PM
> >>
> >>   What is it that you are trying to commit?
> >>
> >> a
> >>
> >> On Thu, Dec 16, 2010 at 1:03 PM, Dennis Gearon <gear...@sbcglobal.net> wrote:
> >>
> >> > What have people found as the best way to do bulk commits either from the
> >> > web or from a file on the system?
> >> >
> >> >  Dennis Gearon
> >>
> >
> >
>
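
The batch of near-identical curl calls quoted above could also be written as a
loop. The following is only a sketch under stated assumptions: the SOLR_URL,
DATA_DIR and DRY_RUN variables and the helper function names are made up for
illustration, and it covers only the split files xab through xap (not
cities1000.csv, which uses a different field list). Unlike the original
commands, it loads each file with commit=false and sends a single commit at
the end, or a rollback if any load fails. That only behaves as intended if
nothing else (such as an autocommit setting) commits in between.

```shell
# Hypothetical settings; override them from the environment as needed.
SOLR_URL=${SOLR_URL:-http://localhost:8983/solr}
DATA_DIR=${DATA_DIR:-'C:\tmp'}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the curl commands, anything else = run them

# Field list copied from the commands in the thread.
FIELDS="id,name,asciiname,latitude,longitude,featureclass,featurecode,countrycode,admin1code,admin2code,admin3code,admin4code,population,elevation,gtopo30,timezone,modificationdate"

# Build the update/csv URL for one file ($1) with the given commit flag ($2).
csv_url() {
  printf '%s/update/csv?commit=%s&separator=%%2C&fieldnames=%s&stream.file=%s&overwrite=true&stream.contentType=text/plain;charset=utf-8' \
    "$SOLR_URL" "$2" "$FIELDS" "$1"
}

# Either print the curl command (dry run) or execute it.
# --fail makes curl exit non-zero when the server answers with an HTTP error.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf 'curl %s\n' "$*"
  else
    curl --fail --silent --show-error "$@"
  fi
}

# Load xab.csv .. xap.csv without committing; stop at the first failure.
load_all() {
  for suffix in b c d e f g h i j k l m n o p; do
    run "$(csv_url "${DATA_DIR}\\xa${suffix}.csv" false)" || return 1
  done
}

# Send an XML update message such as <commit/> or <rollback/>.
post_xml() {
  run "$SOLR_URL/update" -H "Content-Type: text/xml" --data-binary "$1"
}

if load_all; then
  post_xml '<commit/>'
  post_xml '<optimize/>'
else
  post_xml '<rollback/>'
fi
```

With the default DRY_RUN=1 the script only prints the commands it would run,
so the generated URLs can be checked before anything is sent to Solr.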
