Author: Alexander Barkov
Email: b...@mnogosearch.org
Message:
Hi Guillaume,

title, body and meta.description do not really need to be in urlinfo for search purposes in 3.4.x. Search and search result presentation should work fine without them. But you might of course need them for other external purposes, e.g. site analysis.

The intention of the latest changes in 3.4.x was not to store sections in urlinfo by default, but they should still be stored if the "length" parameter is set to a non-zero value. It seems something went wrong. I'll check it after the weekend (I'm currently away from my development box).

> Hi Again,
>
> I'm having a problem with some Section lines in the indexer.conf with mnogosearch 3.4.1
>
> Here is an extract of my indexer.conf:
>
> Section ResponseTime 0 32
> # Standard sections: body, title
> Section body 1 1024
> Section title 2 256
>
> # HTML meta tags, e.g. <META NAME="KEYWORDS" CONTENT="xxxx">
> Section meta.keywords 3
> Section meta.description 4 256
>
> # Incoming link text
> Section ilinktext 5 128
>
> # Document's URL part
> Section url.file 6 0
> Section url.path 7 0
> Section url.host 8 0
> Section url.proto 9 0
>
> # Useful meta information
> Section Charset 10 32
> Section Content-Type 11 64
> Section Content-Language 12 16
>
> # Message/rfc822 headers
> #Section msg.from 15
> #Section msg.to 16
> #Section msg.subject 17
>
> # A user defined section example.
> # Extract text between <h1> and </h1> tags:
> #Section h1 20 128 "<h1>(.*)</h1>" $1
> Section h1 26 256 "<h1[^>]*>(.*)</h1>" $1
> Section h2 26 256 "<h2[^>]*>(.*)</h2>" $1
> Section h3 26 256 "<h3[^>]*>(.*)</h3>" $1
> Section canonical 33 1024 '<link rel="canonical" href="([^"]*)"' $1
> Section ogdescription 33 300 '<meta property="og:description" content="([^"]*")' $1
> Section ogtitle 34 128 '<meta property="og:title" content="([^"]*")' $1
>
> # Uncomment the following lines if you want index MP3 tags.
> #Section MP3.Song 25
> #Section MP3.Album 26
> #Section MP3.Artist 27
> #Section MP3.Year 28
>
> # HTTP headers, e.g. "Server" HTTP header
> #Section header.server 30
> Section header 30 128
> Section header.server 30 128
> Section header.Date 30 128
> Section header.Last-Modified 30 128
> Section header.Etag 30 128
> Section header.X-Robots-Tag 30 128
> # HTML tag attributes
> Section attribute.alt 35 128
> Section attribute.label 36 128
> Section attribute.summary 37 128
> Section attribute.title 38 128
>
> ----
>
> And after the crawl, the only sections saved in the urlinfo table are:
> Canonical
> Charset
> Content-language
> Content-type
> h1
> h2
> h3
> ogdescription
> ogtitle
> ResponseTime
>
> As we can see, various sections are missing, including some important ones such as
> Title and meta.description, which I've checked exist on my server.
> These results are the same for various documents and various servers.
>
> I've also tried not setting a length for title, body and meta.description, as in
> the 3.4 documentation example, but it doesn't work any better.
>
> Did I miss something?
>
> Thanks for the help, mnogosearch is a great tool!
>

Reply: <http://www.mnogosearch.org/board/message.php?id=21747>
_______________________________________________
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general