They can come from arbitrary sources that are out of my control. Therefore
I may not get the charset of the original document, so all I'm left with is
heuristic detection for those fragments. The application must be able to
deal with any XML it receives; it doesn't impose any particular structure
o
On Fri, May 26, 2017 at 4:12 AM, wrote:
> Still, sometimes XML fragments come up, and even if they are not 100% XML
> spec compliant I still have to process them. This includes encoding
> detection as well, when the XML declaration is missing from the fragments.
>
Where do the fragments come fro
On Friday, May 26, 2017 at 10:01:18 AM UTC+3, Henri Sivonen wrote:
> > Think of XML files without the "encoding" attribute in the declaration or
> > HTML files without the meta charset tag.
>
> Per spec, these must be treated as UTF-16 if there's a UTF-16 BOM and
> as UTF-8 otherwise. It's highl
On Thu, May 25, 2017 at 10:44 PM, wrote:
> Think of XML files without the "encoding" attribute in the declaration or
> HTML files without the meta charset tag.
Per spec, these must be treated as UTF-16 if there's a UTF-16 BOM and
as UTF-8 otherwise. It's highly inappropriate to run heuristic
d
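
The rule quoted in the message above (UTF-16 if there is a UTF-16 BOM, UTF-8 otherwise) can be sketched in a few lines. This is only an illustration of that rule, not code from the thread: the function names default_encoding and decode_fragment are made up for the example, and it assumes the input carries no encoding declaration and no out-of-band charset information. It also ignores UTF-32, whose little-endian BOM begins with the same two bytes as the UTF-16 one.

```python
import codecs


def default_encoding(data: bytes) -> str:
    """Pick the fallback encoding for input with no declared charset.

    Hypothetical helper: UTF-16 if the data starts with a UTF-16 BOM
    (either byte order), UTF-8 otherwise. No heuristics are involved.
    """
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    return "utf-8"


def decode_fragment(data: bytes) -> str:
    """Decode a fragment using the BOM-or-UTF-8 rule (illustrative only)."""
    if default_encoding(data) == "utf-16":
        # Python's "utf-16" codec reads the BOM to choose the byte order
        # and drops it from the decoded text.
        return data.decode("utf-16")
    # "utf-8-sig" strips a UTF-8 BOM if one is present and otherwise
    # behaves like plain UTF-8. Invalid bytes raise UnicodeDecodeError.
    return data.decode("utf-8-sig")
```

A UnicodeDecodeError from decode_fragment means the input is not valid under this rule; whether to reject such input or fall back to detection is the question the thread is debating.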
On Tuesday, May 23, 2017 at 7:47:12 PM UTC+3, Joshua Cranmer 🐧 wrote:
> On 5/23/17 2:58 AM, Gabriel Sandor wrote:
> > Hello Henri,
> >
> > I was afraid this might be the case, so the library really is deprecated.
> >
> > The project I'm working on implies multi-lingual environment, users, and
> > f
On 5/23/17 2:58 AM, Gabriel Sandor wrote:
Hello Henri,
I was afraid this might be the case, so the library really is deprecated.
The project I'm working on implies multi-lingual environment, users, and
files, so yes, having a good encoding detector is important. Thanks for the
alternate recomme
wrote:
> > I recently came across the Mozilla Charset Detectors tool, at
> > https://www-archive.mozilla.org/projects/intl/chardet.html. I'm working on
> > a C# project where I could use a port of this library (e.g.
> > https://github.com/errepi/ude) f
On Mon, May 22, 2017 at 12:13 PM, Gabriel Sandor
wrote:
> I recently came across the Mozilla Charset Detectors tool, at
> https://www-archive.mozilla.org/projects/intl/chardet.html. I'm working on
> a C# project where I could use a port of this library (e.g.
> https://github.co
On 22/05/2017 10:13, Gabriel Sandor wrote:
Greetings,
I recently came across the Mozilla Charset Detectors tool, at
https://www-archive.mozilla.org/projects/intl/chardet.html. I'm working on
a C# project where I could use a port of this library (e.g.
https://github.com/errepi/ude) for adv
Greetings,
I recently came across the Mozilla Charset Detectors tool, at
https://www-archive.mozilla.org/projects/intl/chardet.html. I'm working on
a C# project where I could use a port of this library (e.g.
https://github.com/errepi/ude) for advanced charset detection.
I'm not sure
10 matches