The fact that it got committed in the end suggests there was no error in between.

Look at the status URL and check the number of rows returned, etc.

It gives a clue as to what really happened. Or you can
paste your data-config and status XMLs and we may be able to suggest
something.
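
For comparing the counters in the status output, something like the sketch below may help. It is only an illustration: the sample XML, the counter names ("Total Rows Fetched", "Total Documents Processed", "Total Documents Skipped") and the numbers are assumptions about what the status response looks like, not output from this import, and the helper names are made up. Adjust them to whatever your own status XML contains.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of a DataImportHandler status response. The counter
# names and the numbers are illustrative assumptions, not real output.
SAMPLE_STATUS = """<response>
  <str name="status">idle</str>
  <lst name="statusMessages">
    <str name="Total Rows Fetched">1000</str>
    <str name="Total Documents Processed">600</str>
    <str name="Total Documents Skipped">0</str>
  </lst>
</response>"""


def parse_status_messages(status_xml):
    """Return the entries of the statusMessages list as a dict of strings."""
    root = ET.fromstring(status_xml)
    messages = {}
    for lst in root.iter("lst"):
        if lst.get("name") == "statusMessages":
            for entry in lst.findall("str"):
                messages[entry.get("name")] = (entry.text or "").strip()
    return messages


def rows_not_indexed(messages):
    """Rows fetched minus documents processed; None if a counter is missing."""
    try:
        fetched = int(messages["Total Rows Fetched"])
        processed = int(messages["Total Documents Processed"])
    except (KeyError, ValueError):
        return None
    return fetched - processed


if __name__ == "__main__":
    msgs = parse_status_messages(SAMPLE_STATUS)
    print(msgs)
    print("rows fetched but not indexed:", rows_not_indexed(msgs))
```

If the two counters match but the index still holds fewer documents than the table has rows, duplicate uniqueKey values overwriting each other would be one thing to rule out.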

On Thu, Nov 13, 2008 at 9:26 AM, Giri <[EMAIL PROTECTED]> wrote:
> Hi Noble,
>
> Thanks for the reply, my comments are below.
>
>>>why is the id field multivalued?
> I was just trying various options. Yes, this ID is unique, and I checked for
> duplicates: when I ran a distinct(id) query against the MySQL database, it
> returned almost 2 million.
>
>>> look at the status host:port/dataimport gives you the status
> I constantly checked the status using the dataimport URL. The count
> increased up to 600K records, then stopped increasing, and then it took a few
> minutes to commit the indexed data.
>
>
> On Tue, Nov 11, 2008 at 11:35 PM, Noble Paul നോബിള്‍ नोब्ळ् <
> [EMAIL PROTECTED]> wrote:
>
>> Why is the id field multivalued? Is there a uniqueKey in the schema?
>> Are you sure there are no duplicates?
>>
>> Look at the status: host:port/dataimport gives you the status.
>> It can give you some clue.
>>
>> --Noble
>>
>>
>> On Wed, Nov 12, 2008 at 4:53 AM, Giri <[EMAIL PROTECTED]> wrote:
>> > Hi,
>> >
>> > I have about 2 million records in a MySQL database table (about 9 fields
>> > from a single table), and I am trying to load them into Solr using
>> > DataImportHandler with the command=full-import option. It only indexed
>> > about 615360 records out of 2 million.
>> >
>> > Here is my db-data-config.xml:
>> > <dataConfig>
>> >    <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
>> >                url="jdbc:mysql://localhost:3306/mydb" user="ua" password="pw"
>> >                batchSize="-1"/>
>> >    <document name="climate">
>> >        <entity name="occurence" query="select * from mylargetable">
>> >            <field column="id" name="id" />
>> >            <field column="title" name="title" />
>> >            <field column="url" name="url" />
>> >        </entity>
>> >    </document>
>> > </dataConfig>
>> >
>> > And in my Solr schema.xml, I define these fields as:
>> >
>> >    <field name="id" type="string" indexed="true" stored="true"
>> > multiValued="true"/>
>> >    <field name="title" type="text" indexed="true" stored="true"
>> > multiValued="true" required="false"/>
>> >    <field name="url" type="text" indexed="true" stored="true"
>> > multiValued="true" required="false"/>
>> >
>> >
>> > If I try to index just one field (id), it indexes about 960000 records,
>> > but if I try to index all three fields above, it indexes only 615360
>> > records.
>> >
>> > Any help will be appreciated.
>> >
>> > thanks!
>> >
>>
>>
>>
>> --
>> --Noble Paul
>>
>



-- 
--Noble Paul
