I've just replaced the line in /include/ucharset.h
bool operator==(const CUWord& Word) const (line 236)
with
return (*w == 0) && (*w1 == 0);
but the make fails.
I cannot see "bool CUWord::operator==(const CUWord& Word) const" in the
code. I run Aspseek 1.1.1 devel...
Could you just send me the ucharset.h?
Claus
-----Original Message-----
From: Alexander F. Avdonkin [mailto:[EMAIL PROTECTED]]
Sent: 27 March 2001 09:31
To: [EMAIL PROTECTED]
Subject: Re: [aseek-users] Problems with charsets/unicode
Yesterday I found a bug of mine which can lead to improper indexing in the
unicode version.
Replace the line with the "return" statement in the method <bool
CUWord::operator==(const CUWord& Word) const> with
return (*w == 0) && (*w1 == 0);
(file: ucharset.h)
and reindex everything.
Alexander.
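
For orientation, here is a minimal sketch of where such a return statement sits when the operator is defined inside the class body, which would explain why no "CUWord::" qualified name shows up in the header. Everything in it is an assumption made for illustration: UWordSketch, m_word, kMaxLen and the 16-bit storage are hypothetical, and the real CUWord in ucharset.h may be laid out differently. Only the final return line is taken from the message above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for CUWord; the real class in ucharset.h may differ.
struct UWordSketch {
    static const std::size_t kMaxLen = 32;   // assumed capacity
    std::uint16_t m_word[kMaxLen + 1];       // zero-terminated code units

    explicit UWordSketch(const std::uint16_t* src) {
        assert(src != nullptr);
        std::size_t i = 0;
        for (; i < kMaxLen && src[i] != 0; ++i) {
            m_word[i] = src[i];
        }
        m_word[i] = 0;
    }

    // Defined inside the class body, so the signature carries no
    // "UWordSketch::" qualifier.
    bool operator==(const UWordSketch& Word) const {
        const std::uint16_t* w = m_word;
        const std::uint16_t* w1 = Word.m_word;
        // Advance while both words still have a code unit and the units match.
        while (*w != 0 && *w1 != 0 && *w == *w1) {
            ++w;
            ++w1;
        }
        // Equal only if both words ended at the same position.
        return (*w == 0) && (*w1 == 0);
    }
};
```

With a definition of this shape, only the return line inside the braces would be replaced; the signature line stays as it is.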
----- Original Message -----
From: "Claus Jul Larsen" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, March 27, 2001 3:04 PM
Subject: SV: [aseek-users] Problems with charsets/unicode
> Hi Alexander
>
> OK. Maybe that's the problem. I have now uncommented the CharsetAlias - I
> hope it works again.
>
> I hope you can help me with this problem: Click on the link:
>
> http://search.enovasion.dk/cgi-bin/s.cgi?q=bech&cs=iso-8859-1&ul=http%3A%2F%2Fwww.experimentarium.dk%2Fdk%2F%25&tmpl=%2Fweb%2Fsearch.enovasion.dk%2Fcgi-bin%2Fs.htm
>
> See results no. 5, 7 and 8. There are some strange codes in the excerpts.
> It seems to be something from the C++ code?
>
> It seems the Aspseek team gives great support! Well done!
>
> Claus
>
> -----Original Message-----
> From: Alexander F. Avdonkin [mailto:[EMAIL PROTECTED]]
> Sent: 26 March 2001 15:43
> To: [EMAIL PROTECTED]
> Subject: Re: [aseek-users] Problems with charsets/unicode
>
>
> So, do you mean that it didn't work for URLs with charset equal to
> ISO-8859-1 and worked with charset equal to iso-8859-1?
> Do you use the CharsetAlias command in "aspseek.conf" and "searchd.conf"?
>
>
> Alexander.
> ----- Original Message -----
> From: "Claus Jul Larsen" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Monday, March 26, 2001 9:24 PM
> Subject: SV: [aseek-users] Problems with charsets/unicode
>
>
> > It was mostly
> >
> > ISO-8859-1
> > ISO88591
> >
> > Claus
> >
> > -----Original Message-----
> > From: Alexander F. Avdonkin [mailto:[EMAIL PROTECTED]]
> > Sent: 26 March 2001 15:15
> > To: [EMAIL PROTECTED]
> > Subject: Re: [aseek-users] Problems with charsets/unicode
> >
> >
> > This is strange, because "index" puts the charset found either in META or
> > in the header into this field.
> > What kind of values did you have in this field?
> >
> > Alexander.
> >
> > ----- Original Message -----
> > From: "Claus Jul Larsen" <[EMAIL PROTECTED]>
> > To: <[EMAIL PROTECTED]>
> > Sent: Monday, March 26, 2001 9:00 PM
> > Subject: SV: [aseek-users] Problems with charsets/unicode
> >
> >
> > > Now I have found the reason for the problem. In the urlwordsxx tables,
> > > the charset field must contain iso-8859-1 - I changed the field in all
> > > rows and it's working now!
> > >
> > > How do I put iso-8859-1 into the charset field through the index process?
> > >
> > > Claus
> > >
> > > -----Original Message-----
> > > From: [EMAIL PROTECTED]
> > > [mailto:[EMAIL PROTECTED]] On behalf of Claus Jul Larsen
> > > Sent: 26 March 2001 11:49
> > > To: '[EMAIL PROTECTED]'
> > > Subject: SV: [aseek-users] Problems with charsets/unicode
> > >
> > >
> > > I've tried to use the original ucharset.conf (copied from
> > > ucharset.conf-dist) .. but nothing helps... This problem occurs ONLY
> > > with Danish chars that are not written as HTML entities.
> > >
> > > In the db these Danish chars are correct but s.cgi doesn't show them.....
> > >
> > > I included this in both searchd.conf and aspseek.conf:
> > >
> > > # Charsets configuration
> > > Include charsets.conf
> > >
> > > # Unicode charsets configuration
> > > Include ucharset.conf
> > >
> > > # Stopwords configuration
> > > Include stopwords.conf
> > >
> > > Claus
> > >
> > > -----Original Message-----
> > > From: Alexander F. Avdonkin [mailto:[EMAIL PROTECTED]]
> > > Sent: 26 March 2001 10:47
> > > To: [EMAIL PROTECTED]
> > > Subject: Re: [aseek-users] Problems with charsets/unicode
> > >
> > >
> > > I've just tried to index a URL containing this kind of Danish chars and
> > > used cs=iso-8859-1 for search; everything seems to be OK.
> > > Didn't you remove the CharsetTableU1 directives from "searchd.conf"
> > > and/or "aspseek.conf", or change the files with charset definitions?
> > > Try to use the original "ucharset.conf".
> > >
> > > Alexander.
> > > ----- Original Message -----
> > > From: "Claus Jul Larsen" <[EMAIL PROTECTED]>
> > > To: <[EMAIL PROTECTED]>
> > > Sent: Monday, March 26, 2001 3:45 PM
> > > Subject: SV: [aseek-users] Problems with charsets/unicode
> > >
> > >
> > > > I've looked at a page with the problem and at its HTML source:
> > > >
> > > > <head>
> > > > <meta http-equiv="content-type"
> > > > content="text/html;charset=iso-8859-1">
> > > > <meta name="generator" content="Adobe GoLive 4">
> > > > <title>Experimentarium • Årsberetning
> > > > 1999</title>
> > > >
> > > > <meta http-equiv="pragma" content="no-casche">
> > > >
> > > > <link rel="stylesheet" href="/stylesheets/overlib.css"
> > > > type="text/css">
> > > >
> > > >
> > > > <style type="text/css"><!--
> > > > #overdiv { position: absolute; z-index: 1; top: 0px;
> > > > left: 0px; visibility: hidden }-->
> > > > </style>
> > > > </head>
> > > >
> > > >
> > > > Does it look OK?
> > > > If there is no <meta http-equiv="content-type"
> > > > content="text/html;charset=iso-8859-1"> in an HTML page - can I set a
> > > > default charset?
> > > >
> > > > Claus
> > > >
> > > > -----Original Message-----
> > > > From: Alexander F. Avdonkin [mailto:[EMAIL PROTECTED]]
> > > > Sent: 26 March 2001 09:37
> > > > To: [EMAIL PROTECTED]
> > > > Subject: Re: [aseek-users] Problems with charsets/unicode
> > > >
> > > >
> > > > Check whether the correct charset is set for those pages. The charset
> > > > can be set by
> > > > <META http-equiv="Content-Type" content="text/html; charset=<charset-name>">
> > > > or returned in the HTTP header.
> > > > If the charset is not set, then ASPseek assumes that the charset of
> > > > those pages is "usascii" (7-bit).
> > > >
> > > > Alexander.
> > > >
> > > > ----- Original Message -----
> > > > From: "Claus Jul Larsen" <[EMAIL PROTECTED]>
> > > > To: <[EMAIL PROTECTED]>
> > > > Sent: Monday, March 26, 2001 3:29 PM
> > > > Subject: [aseek-users] Problems with charsets/unicode
> > > >
> > > >
> > > > > Hi
> > > > >
> > > > > I have a problem again with the Danish charsets. I set the param
> > > > > cs=iso-8859-1 and it works well in excerpts, with matches marked in
> > > > > bold.
> > > > >
> > > > > The Danish chars in excerpts work well, but only if the indexed
> > > > > pages have HTML entities such as &aelig; &oslash; and more.... these
> > > > > are converted correctly to Danish characters.
> > > > >
> > > > > BUT if there are pages which contain Danish chars that are not
> > > > > HTML entities, the excerpts show:
> > > > >
> > > > > ...
> > > > > Future Body Den virtuelle debatbog Artikellisten ?bningstale Tale ved
> > > > > ?bningen af udstillingen Future Body af Erling Tiedemann, formand for
> > > > > Det Etiske R?d Deres kongelige h?jhed, mine damer og her
> > > > > ...
> > > > >
> > > > > So in these places the Danish chars are only shown as '?'. Why? The
> > > > > page which contains the excerpts doesn't have HTML entities such as
> > > > > &aelig; &oslash; but æ and ø ....
> > > > >
> > > > > I run the unicode version of Aspseek and don't use the charset
> > > > > definitions in templates, because the unicode version only uses
> > > > > unicode.
> > > > >
> > > > > Maybe something is wrong in the conf?
> > > > >
> > > > > Claus
> > > > >
> > > > > Best regards
> > > > >
> > > > > Claus Jul Larsen
> > > > > System Developer
> > > > >
> > > > > _______________________________________________________________
> > > > >
> > > > > e|novasion
> > > > > Store Kongensgade 23A
> > > > > 1264 København K
> > > > >
> > > > > Text telephone: 77 31 20 10 (first call 70 11 44 11 and ask
> > > > > the operator for 77 31 20 10)
> > > > > Fax: 77 31 19 50
> > > > > E-mail: [EMAIL PROTECTED]
> > > > > Web: www.enovasion.dk
> > > > > _______________________________________________________________