Replying to my own post: I just tried Solr 1.2 with the two previous versions of acts_as_solr and it worked great, so I'm pretty sure this is a solr-ruby issue. I'll do some more testing on the way solr-ruby adds documents to Solr.
--
Thiago Jackiw
acts_as_solr => http://acts-as-solr.railsfreaks.com

On 6/19/07, Thiago Jackiw <[EMAIL PROTECTED]> wrote:
What's interesting is that on the previous versions of acts_as_solr (without solr-ruby) the html entities were getting indexed fine without passing through ERB's html_escape method. That's what I did as a fast fix before starting this thread.

Did anything change in Solr 1.2 in regards to xml parsing? And I guess I should try the previous version of the acts_as_solr plugin with Solr 1.2 to see if I get the same error.

--
Thiago Jackiw
acts_as_solr => http://acts-as-solr.railsfreaks.com

On 6/19/07, Aaron Suggs <[EMAIL PROTECTED]> wrote:
> I was getting the same XmlPullParserException from solr while using
> solr-ruby to index HTML.
>
> I solved things by running text through the html_escape() method in
> ERB::Util before submitting to Solr.
>
> In the console, the following generates the XmlPullParserException in
> solr, which manifests itself as a Net::HTTPFatalError in solr-ruby:
>
>   Solr::Connection.new('http://localhost:8083/solr', :autocommit => :on).add(:id => 1, :value_t => '&nbsp;')
>   Net::HTTPFatalError: 500...XmlPullParserException...
>
> But escaping the characters with html_escape (aliased as the h()
> method by default) works like a charm:
>
>   include ERB::Util
>   Solr::Connection.new('http://localhost:8083/solr', :autocommit => :on).add(:id => 1, :value_t => h('&nbsp;'))
>   => true
>
> Subsequently, searching for strings like 'nbsp' returns hits on those
> escaped entities, which may or may not be what you want:
>
>   >> Solr::Connection.new(SOLR_URL, :autocommit => :on).query('value_t:nbsp').hits
>   => [{"score"=>10.771498, "id"=>1, "value_t"=>" "}]
>
> If you don't want searches for 'nbsp' to return all documents with
> escaped non-breaking spaces, the solution lies in defining some new
> fieldtype in solr/conf/schema.xml
>
> -Aaron Suggs
>
> On 6/19/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> > On 6/19/07, Thiago Jackiw <[EMAIL PROTECTED]> wrote:
> > > There's something funky with solr-ruby's xml processing when adding
> > > documents, but I don't really know what it is yet. It can't process
> > > html entities at all, not even an html blank space "&nbsp;":
> >
> > nbsp is not a default XML entity.
> > Try replacing it with  
> >
> > -Yonik
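
For reference, the html_escape workaround described in the quoted messages can be sketched on its own, without a running Solr instance. This is only a rough illustration: `escape_for_solr` is a hypothetical helper name, not part of solr-ruby or acts_as_solr, and it assumes the field value is plain text that should be stored literally.

```ruby
require 'erb'

# Hypothetical helper (not part of solr-ruby): escape a field value so that
# an ampersand starting an HTML-only entity name is sent as literal text
# instead of an entity reference the XML parser does not know.
def escape_for_solr(value)
  ERB::Util.html_escape(value.to_s)
end

raw     = 'a&nbsp;b'
escaped = escape_for_solr(raw)

puts escaped  # prints "a&amp;nbsp;b"
```

The escaped string contains only `&amp;`, which is one of the predefined XML entities, so it should no longer trip the parser. As noted in the thread, the trade-off is that the literal text "nbsp" ends up in the indexed field.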
