Re: escaping HTML tags within XML file

2011-09-25 Thread pulkitsinghal
Yes sir!

On Sep 25, 2011, at 4:06 PM, okayndc wrote:
> Here is a representation of the XML file...
>
> Text hereMore text here
>
> I want to keep the HTML tags because it keeps the formatting (paragraph
> tags, etc) intact for the output. Seems like you'

Re: escaping HTML tags within XML file

2011-09-25 Thread Michael Sokolov
Yes - you can index HTML text only while keeping the tags in place in the stored field using HTMLCharFilter (or possibly XMLCharFilter). But you will find that embedding HTML inside XML can be problematic since HTML tags don't have to follow the well-formed constraints that XML requires. For
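The well-formedness problem mentioned above can be sketched in a few lines of Python. This is only an illustration, not anything from the thread: the HTML fragment, the `doc`/`field` element names, and the `is_well_formed` helper are all assumptions made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical HTML fragment: acceptable HTML, but <br> is never closed.
html_fragment = "<p>Text here<br>More text here</p>"


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Embedded raw, the parser treats <p> and <br> as XML elements,
# so the unclosed <br> makes the whole document ill-formed.
raw_doc = f"<doc><field name='body'>{html_fragment}</field></doc>"

# Escaped, the markup becomes ordinary character data.
escaped_doc = f"<doc><field name='body'>{escape(html_fragment)}</field></doc>"

raw_ok = is_well_formed(raw_doc)
escaped_ok = is_well_formed(escaped_doc)

# The parser hands back the original HTML string, tags included.
recovered = ET.fromstring(escaped_doc).find("field").text
```

Here `raw_ok` comes out false and `escaped_ok` true, and `recovered` is the original fragment, so escaping loses nothing on the way in.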

Re: escaping HTML tags within XML file

2011-09-25 Thread okayndc
Here is a representation of the XML file... Text hereMore text here I want to keep the HTML tags because they keep the formatting (paragraph tags, etc.) intact for the output. Seems like you're saying that the HTML can be kept intact with the use of an HTML field type without having to esca

Re: escaping HTML tags within XML file

2011-09-25 Thread pulkitsinghal
Assuming that the XML has the HTML as values inside fully formed tags like so: then I think that using the "HTML" field type in schema.xml for indexing/storing will allow you to do meaningful searches on the content of the HTML without getting confused by the HTML syntax itself. If you have abs
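The idea of searching the content while ignoring the HTML syntax can be sketched outside Solr as follows. This is not Solr's implementation (on the Solr side the stripping would normally be done in the field type's analysis chain, for example with `HTMLStripCharFilterFactory`); the `TextExtractor` class, the `strip_tags` helper, and the sample value are hypothetical names chosen for the illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html_text: str) -> str:
    """Return the text content of html_text with whitespace normalised."""
    parser = TextExtractor()
    parser.feed(html_text)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


# What would be stored and returned to the client: tags intact.
stored_value = "<p>Text here</p><p>More text here</p>"

# What would be analysed for search: text only.
indexed_text = strip_tags(stored_value)
```

The stored value keeps its paragraph tags for display, while the text used for matching is just `Text here More text here`.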

escaping HTML tags within XML file

2011-09-25 Thread okayndc
Hello, I was wondering whether it is necessary to escape HTML tags within an XML file for indexing. If so, it seems like large XML files with tons of HTML tags could get really messy (using CDATA). Has this been your experience? Do you escape the HTML tags? If so, what technique do you use? Or do you le
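For comparison, the two usual techniques (entity escaping and a CDATA section) can be sketched like this. It is a minimal sketch under assumptions: the `add`/`doc`/`field` layout follows Solr's XML update format, but the field name `body`, the sample HTML, and the helper names are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical field value containing HTML markup.
html_fragment = "<p>Text here</p><p>More text here</p>"


def as_escaped(text: str) -> str:
    """Entity-escape &, < and > so the markup is plain character data."""
    return escape(text)


def as_cdata(text: str) -> str:
    """Wrap text in CDATA; a literal ']]>' is split across two sections."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def build_doc(field_value: str) -> str:
    """Place an already-protected value into a single-field update document."""
    return f'<add><doc><field name="body">{field_value}</field></doc></add>'


def field_text(xml_text: str) -> str:
    """Parse the document and return the field's text as the parser sees it."""
    return ET.fromstring(xml_text).find("./doc/field").text


escaped_doc = build_doc(as_escaped(html_fragment))
cdata_doc = build_doc(as_cdata(html_fragment))

from_escaped = field_text(escaped_doc)
from_cdata = field_text(cdata_doc)
```

Both documents parse to the same field value, so the choice is mostly about readability of the file: CDATA keeps the HTML legible in the source, while escaping needs no special handling for `]]>`.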