Shalin
I downloaded the nightly for 21-Jan and tried DIH again. It's better but
still broken. Dozens of embedded tags are stripped from documents,
but it now fails every few documents for no reason I can see. Manually
removing embedded tags causes a given problem document to be indexed,
only to have a it
Hi Fergus,
It seems a field it is expecting is missing from the XML.
<field column="fileAbsPath" template="${jcurrent.fileAbsolutePath}" />
<field column="fileWebPath" regex="/Volumes/spare/ts/(.*)" replaceWith="$1"
sourceColName="*fileAbsePath*"/>
I guess fileAbsePath is a typo? Can you check if that is the
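
To illustrate the point about the source column, here is a minimal sketch of the two fields with sourceColName pointing at the column defined on the line before it. It assumes the intended name is fileAbsPath, which the thread suggests but does not confirm. The entity name "x", the url, and the forEach path are hypothetical placeholders, not taken from the original config.

```xml
<!-- Sketch only: entity name "x", url, and forEach are assumptions. -->
<entity name="x"
        processor="XPathEntityProcessor"
        url="${jcurrent.fileAbsolutePath}"
        forEach="/record"
        transformer="TemplateTransformer,RegexTransformer">
  <!-- TemplateTransformer fills fileAbsPath from the outer entity's variable. -->
  <field column="fileAbsPath" template="${jcurrent.fileAbsolutePath}" />
  <!-- RegexTransformer reads fileAbsPath (assumed intended name) and
       keeps only the part of the path after /Volumes/spare/ts/. -->
  <field column="fileWebPath"
         regex="/Volumes/spare/ts/(.*)"
         replaceWith="$1"
         sourceColName="fileAbsPath" />
</entity>
```

TemplateTransformer is listed before RegexTransformer so that fileAbsPath already exists when the regex field looks it up.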
Hi Fergus,
It seems a field it is expecting is missing from the XML.
You mean there is some field in the document we are indexing
that is missing?
<field column="fileAbsPath" template="${jcurrent.fileAbsolutePath}" />
<field column="fileWebPath" regex="/Volumes/spare/ts/(.*)" replaceWith="$1"
Hello all,
I have the following DIH data-config.xml file. Adding
HTMLStripTransformer and the associated stripHTML on the
para tag seems to have broken things. I am using a nightly
build from 12-jan-2009
The /record/sect1/para contains HTML sub tags which need
to be discarded. Is my use of
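
For reference, a minimal sketch of how stripHTML is usually declared in a DIH config: the transformer is named on the entity and the flag is set on the field. Only the /record/sect1/para xpath comes from the message above. The entity name "x", the column name "para", the url, and the forEach path are assumptions for illustration.

```xml
<!-- Sketch only: entity name, column name, url, and forEach are assumptions. -->
<entity name="x"
        processor="XPathEntityProcessor"
        url="${jcurrent.fileAbsolutePath}"
        forEach="/record"
        transformer="HTMLStripTransformer">
  <!-- stripHTML="true" asks HTMLStripTransformer to drop markup from this column. -->
  <field column="para" xpath="/record/sect1/para" stripHTML="true" />
</entity>
```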
This looks fine. Can you post the stack trace?
Yep, here is the juicy bit. Let me know if you need more.
Jan 19, 2009 11:08:03 AM org.apache.catalina.startup.Catalina start
INFO: Server startup in 2390 ms
Jan 19, 2009 11:14:06 AM org.apache.solr.core.SolrCore execute
INFO: [janesdocs]
Hmmm,
Just to clarify, I retested using the nightly as of today,
18-jan-2009. The problem is still there and this traceback is from
that nightly.
This looks fine. Can you post the stack trace?
Yep, here is the juicy bit. Let me know if you need more.
Jan 19, 2009 11:08:03 AM
Ah, it needs a null check for multi-valued fields. I've committed a fix to
trunk. The next nightly build should have it. You can check out and build
from trunk if you need this immediately.
On Mon, Jan 19, 2009 at 7:02 PM, Fergus McMenemie fer...@twig.me.uk wrote:
Hmmm,
Just to clarify I