Solr 4.1.0

We've been using the DIH to pull data in from a MySQL database for quite
some time now.  We now want to strip the HTML out of many fields using the
HTMLStripTransformer (
http://wiki.apache.org/solr/DataImportHandler#HTMLStripTransformer).
Unfortunately, while it works fine for "top-level" entities, we can't get
it to work for sub-entities:

(not exact schema, reduced for example purposes)

<entity name="blocks" dataSource="database"
transformer="HTMLStripTransformer" query="
  SELECT
    id as blockId,
    name as blockTitle,
    content as content
  FROM engagement_block
  ">
  <field column="content" stripHTML="true" />  <!-- THIS WORKS! -->
  <entity name="blockReplies" dataSource="database"
transformer="HTMLStripTransformer" query="
    SELECT
      br.other_content AS replyContent
    FROM block_reply br
    ">
    <field column="other_content" stripHTML="true" />  <!-- THIS DOESN'T WORK! -->
  </entity>
</entity>

We've tried several permutations, placing the sub-entity's field element
at different nesting levels of the XML, to no avail.  Is stripping HTML in
a sub-entity simply not supported, or are we configuring it incorrectly?
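
One permutation for reference, as an untested sketch: it assumes DIH matches the `column` attribute against the label the query actually returns (the `replyContent` alias) rather than the underlying `other_content` column. It also assumes `br` is an alias for `block_reply`.

```xml
<entity name="blockReplies" dataSource="database"
        transformer="HTMLStripTransformer" query="
    SELECT
      br.other_content AS replyContent
    FROM block_reply br
    ">
  <!-- Assumption: column names the SQL alias, not the table column -->
  <field column="replyContent" stripHTML="true" />
</entity>
```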

Thanks,
Andy Pickler
