OK, that probably makes sense. I would have preferred not to delete the index.
I was confused because it works when you use the regular /update handler and
post an XML with the content of the document.

For some reason it doesn't work when using the /update/extract handler. I guess
something there doesn't detect the document as the same one when the old
copy doesn't have the _root_ field.

I am afraid I need to delete and reindex the collection.
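
For reference, the alternative mentioned in the quoted message below (reindex
everything, then remove the old copies that lack type_level:parent) could look
roughly like this. It is only a sketch, not something tested against this
collection: the helper name build_cleanup_request is made up for illustration,
the host and collection name are taken from the request quoted below, and it
assumes every document you want to keep has already been reindexed with
type_level:parent. Any child documents with a different type_level would match
the query too, so the query would need adjusting in that case.

```python
import json
from urllib import request


def build_cleanup_request(base_url, collection,
                          marker_field="type_level", marker_value="parent"):
    """Build (but do not send) a Solr delete-by-query request.

    Hypothetical helper: it targets every document that does NOT carry
    marker_field:marker_value, i.e. the stale pre-schema-change copies.
    """
    # "*:*" matches everything; the negative clause excludes reindexed docs.
    query = f"*:* -{marker_field}:{marker_value}"
    body = json.dumps({"delete": {"query": query}}).encode("utf-8")
    url = f"{base_url}/{collection}/update?commit=true"
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_cleanup_request("http://server:8983/solr", "document")
print(req.full_url)
print(req.data.decode("utf-8"))
# Sending is deliberately left out; when ready: request.urlopen(req)
```

Running the same query as a normal search first (q=*:* -type_level:parent)
shows which documents would be removed before anything is deleted.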

Thanks guys
Sergio

On Mon, 19 May 2025 at 15:07, Mikhail Khludnev <m...@apache.org> wrote:

> Hello Sergio,
> I don't think that adding nested docs to an index that already contains
> standalone docs was ever supported or considered.
> Right.
> Please check the text in bold here:
>
> https://solr.apache.org/guide/solr/latest/indexing-guide/indexing-nested-documents.html#schema-configuration
>
> On Mon, May 19, 2025 at 3:29 PM Sergio García Maroto <marot...@gmail.com>
> wrote:
>
> > Thanks for your quick response. Let me elaborate a bit more.
> > There is an existing index I have in production where I reindex documents
> > all the time using the DocID unique field.
> > When a new request comes in using /update/extract, that document gets
> > reindexed and replaced with new data. No problem with that.
> >
> > Once I added the _root_ and type_level fields to the schema, the old
> > existing document in Solr stays the same and a new document gets created.
> > If I reindex the same document again, the new one gets reindexed and
> > rewritten, but the old one is still there.
> >
> > I have the feeling there is an issue with /update/extract the first time
> > you add _root_ and type_level to the schema. It doesn't understand that
> > the old document and the new one are the same.
> >
> > This forces me to delete the index and reindex from scratch, or to reindex
> > all documents and then, at the end, delete the old ones that don't have
> > type_level:parent.
> >
> > Any ideas on this?
> >
> >
> >
> >
> >
> > On Mon 19 May 2025 at 13:40, Ehrenleitner Robert Harald <
> > robert.ehrenleit...@plus.ac.at> wrote:
> >
> > > Hi,
> > >
> > > what exactly do you mean by "my document appears twice"? A document can
> > > appear a hundred times if all the entries differ only by the ID. Make
> > > sure your indexer takes care of this. Also, your unique ID field is
> > > "DocID", and according to your sample, its value seems to match ID. Make
> > > sure it always matches, otherwise it is handled like a compound primary
> > > key in a SQL database (actually, Solr's DB is a no-SQL database, but
> > > this only concerns the way the data is queried).
> > >
> > > Also, make sure your query does not confuse parent ID and ID in some
> > > way. This could happen due to a bug in the querying application.
> > >
> > > Mag.phil. Robert Ehrenleitner, BEng.
> > > --
> > >
> > > Web-Developer
> > >
> > > IT-Services | Application & Digitalization Services
> > >
> > > Hellbrunner Straße 34 | 5020 Salzburg | Austria
> > >
> > > Tel.: +43/(0)662/8044 - 6778
> > >
> > > *www.plus.ac.at <http://www.plus.ac.at>*
> > >
> > >
> > > ------------------------------
> > > *From:* Sergio García Maroto <marot...@gmail.com>
> > > *Sent:* Monday, 19 May 2025 13:00
> > > *To:* solr-user <solr-u...@lucene.apache.org>
> > > *Subject:* Using ExtractRequest handler to index documents using
> > > type_leve=parent
> > > Hi,
> > >
> > > I have been indexing documents for a long time using /update/extract.
> > > Everything has been working well until I got a new requirement to add
> > > nested documents.
> > >
> > > I added to schema.xml:
> > > <field name="type_level" type="string" indexed="true" stored="true"
> > >        docValues="true" />
> > > <field name="_root_" type="string" indexed="true" stored="true"
> > >        multiValued="false" required="false" />
> > >
> > > My unique field
> > > <field name="DocID" type="string" indexed="true" stored="true" />
> > > <uniqueKey>DocID</uniqueKey>
> > >
> > > After doing this, my request to /update/extract to reindex the same
> > > document duplicates the document in Solr.
> > > Here is my request. The only thing I changed is the new parameter
> > > type_level:parent.
> > >
> > > http://server:8983/solr/document/update/extract?
> > > literal.id=6584239&
> > > resource.name=&
> > > wt=xml&
> > > literal.DocID=6584239&
> > > literal.CoreID=6584239&
> > > literal.DocIsAttachToPNB=False&
> > > literal.DocAuthorID=1455&
> > > literal.DocIsAttachToPerson=True&
> > > literal.DocIsAttachToAssign=False&
> > > literal.DocIsAttachToCompany=False&
> > > literal.DocVersionID=4504527&
> > > literal.InsertDateSD=2011-01-03T07%3a51%3a00.0Z&
> > > literal.DocNameS=Squires+David+RES.doc&
> > > literal.DocCateNameS=Resume%2fCV&
> > > literal.DocAreaCateNameS=Person+Module&
> > > literal.type_level=parent&
> > > stream.url=http%3a%2f%2flocalhost%3a8081%2f4%2f50%2f45%2fSquires%2520David%2520RES15EAC416-AF05-4D38-A4F9-7B489962C167.docx&
> > > overwrite=true&
> > > commit=true
> > >
> > > After this request the document appears duplicated. The only difference
> > > between the old and the new one is type_level:parent.
> > >
> > > Does anyone have any idea why this is happening?
> > >
> > > Regards,
> > > Sergio Maroto
> > >
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>
