On 2014/02/13 22:35, Petite Abeille wrote:
While we are at it, www.sqlite.org exhibits many validation errors:
http://validator.w3.org/check?uri=http%3A%2F%2Fwww.sqlite.org%2F&charset=%28detect+automatically%29&doctype=Inline&group=0&user-agent=W3C_Validator%2F1.3+http%3A%2F%2Fvalidator.w3.org%2
On Feb 13, 2014, at 9:52 PM, Jan Nijtmans wrote:
> But if you put the validator in HTML5 mode, there are far fewer errors:
Possibly. But it says 'HTML 4.01 Strict' on the tin:
http://www.w3.org/TR/html4/strict.dtd">
Either way, a bunch of errors.
2014-02-13 21:35 GMT+01:00 Petite Abeille:
>
> On Feb 13, 2014, at 9:08 PM, Petite Abeille wrote:
>
>> curl -s http://www.sqlite.org | lynx -nolist -stdin -dump
>
> While we are at it, www.sqlite.org exhibits many validation errors:
>
> http://validator.w3.org/check?uri=http%3A%2F%2Fwww.sqlite.or
On Feb 13, 2014, at 9:08 PM, Petite Abeille wrote:
> curl -s http://www.sqlite.org | lynx -nolist -stdin -dump
While we are at it, www.sqlite.org exhibits many validation errors:
http://validator.w3.org/check?uri=http%3A%2F%2Fwww.sqlite.org%2F&charset=%28detect+automatically%29&doctype=Inline&
My current project needed to tokenize the text in HTML without the tags.
The easy solution for us was to license a library from Chilkat that
supported text extraction then tokenize that. I'm on my phone at the moment
but could supply more details later if desired.
SDR
On Feb 13, 2014 1:02 PM, "Dav
On Feb 13, 2014, at 8:48 PM, Wang, Baoping wrote:
> I'm new to SQLite. Does anybody know whether there is an HTML tokenizer for full-text search,
No.
> Or do I need to implement my own?
If you feel the urge. Otherwise, try lynx -dump.
For example:
curl -s http://www.sqlite.org | lynx -nolist -stdin -dump
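
If shelling out to lynx is not an option, the same "strip the markup, keep the text" step can be sketched with Python's standard library. This is only an illustration, not what lynx does internally; the names TextExtractor and html_to_text are made up for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collect text nodes, skipping <script>/<style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join chunks with a space so words from adjacent elements do not
        # run together, then collapse all runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(markup):
    """Return the visible text of an HTML string (rough approximation)."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return parser.text()


print(html_to_text("<p>Hello <b>world</b>!</p><script>var x = 1;</script>"))
```

The resulting plain text can then be inserted into a full-text table in place of the raw HTML. Layout is not preserved the way lynx -dump preserves it; for tokenizing, that usually does not matter.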
> I'm new to SQLite. Does anybody know whether there is an HTML tokenizer for full-text search,
> or do I need to implement my own?
There isn't an HTML tokeniser, but the default tokeniser treats punctuation
such as < and > as word breaks, so it may already work for you, with the downside
that things like Hello! wi
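
A quick way to see what the default tokeniser does with raw HTML is to index a fragment and query it. The sketch below assumes the Python sqlite3 module is linked against an SQLite build with FTS4 enabled; the table name pages and the helper hits are made up for this example.

```python
import sqlite3

# Assumption: the underlying SQLite library was compiled with FTS3/FTS4.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE pages USING fts4(body, tokenize=simple)")
conn.execute(
    "INSERT INTO pages(body) VALUES (?)",
    ("<p>Hello <b>world</b>!</p>",),
)


def hits(term):
    """Hypothetical helper: number of rows whose body matches the term."""
    row = conn.execute(
        "SELECT count(*) FROM pages WHERE body MATCH ?", (term,)
    ).fetchone()
    return row[0]


# The words between the tags are found, because < and > act as separators.
print("hello:", hits("hello"))
print("world:", hits("world"))
# Tag names are indexed as ordinary words too, so a search for "b" also hits.
print("b:", hits("b"))
```

So raw HTML is searchable as-is, but element and attribute names end up in the index alongside the real text. Stripping the markup before inserting avoids that.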
I'm new to SQLite. Does anybody know whether there is an HTML tokenizer for full-text search,
or do I need to implement my own?
Thanks