Re: [CentOS] UTF-8 support in PCRE
Amitava Shee wrote on Wed, 9 Jul 2008 13:27:35 -0400:

> PCRE in CentOS does not have unicode properties enabled.

But that's different from what you claimed earlier!

Kai

--
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] UTF-8 support in PCRE
Amitava Shee wrote:

> The issue is in CentOS 5. I ran the application successfully in Ubuntu 8.04.
> PCRE in CentOS does not have unicode properties enabled.

So it's not utf-8 support which is missing.

> Is there a way to enable these options (without the usual ./configure make)?

Rebuild the src.rpm with the correct features enabled and/or file a bug upstream at http://bugzilla.redhat.com/.

Ralph
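A minimal sketch of the rebuild Ralph suggests, assuming the spec file contains the `%configure --enable-utf8` line he quotes elsewhere in this thread. `enable_ucp_in_spec` is a hypothetical helper name; `--enable-unicode-properties` is PCRE's configure switch for Unicode property support. The paths and the download step in the comments are assumptions about a stock CentOS 5 setup, not something confirmed in the thread.

```shell
#!/bin/sh
# Hypothetical helper: append --enable-unicode-properties to the
# %configure line of a pcre spec file, editing the file in place.
enable_ucp_in_spec() {
    spec=$1
    # Nothing to do if the flag is already present.
    if grep -q -- '--enable-unicode-properties' "$spec"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    # "&" re-inserts the matched text, so the existing flag is kept.
    sed 's/^%configure --enable-utf8/& --enable-unicode-properties/' "$spec" > "$tmp" &&
        cat "$tmp" > "$spec"
    status=$?
    rm -f "$tmp"
    return $status
}

# Assumed workflow on CentOS 5 (paths may differ on your system):
#   yumdownloader --source pcre        # yumdownloader ships in yum-utils
#   rpm -ivh pcre-*.src.rpm            # unpacks under /usr/src/redhat
#   enable_ucp_in_spec /usr/src/redhat/SPECS/pcre.spec
#   rpmbuild -ba /usr/src/redhat/SPECS/pcre.spec
```

A locally rebuilt package will probably be replaced by the next distribution update of pcre unless its release tag is bumped or the package is excluded from yum updates, so this only partly answers the wish for something that tracks `yum update`.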
Re: [CentOS] UTF-8 support in PCRE
The issue is in CentOS 5. I ran the application successfully in Ubuntu 8.04. PCRE in CentOS does not have unicode properties enabled. Please see pcretest -C outputs from CentOS and Ubuntu

CentOS 5
========
[EMAIL PROTECTED] pcretest -C
PCRE version 6.6 06-Feb-2006
Compiled with
  UTF-8 support
  No Unicode properties support
  Newline character is LF
  Internal link size = 2
  POSIX malloc threshold = 10
  Default match limit = 1000
  Default recursion depth limit = 1000
  Match recursion uses stack

Ubuntu
======
[EMAIL PROTECTED]:~$ pcretest -C
PCRE version 7.4 2007-09-21
Compiled with
  UTF-8 support
  Unicode properties support
  Newline sequence is LF
  \R matches all Unicode newlines
  Internal link size = 2
  POSIX malloc threshold = 10
  Default match limit = 1000
  Default recursion depth limit = 1000
  Match recursion uses stack

Is there a way to enable these options (without the usual ./configure make)?

-Amitava

On Tue, Jul 8, 2008 at 6:44 AM, Ralph Angenendt [EMAIL PROTECTED] wrote:

> Amitava Shee wrote:
>
> > Yes, building from source will work. I just want to know if there is a
> > package (in some yum repository) somewhere so that updates, patches etc.
> > gets applied with yum update. It would be nice to do something like
> >
> > yum install pcre-utf8
>
> Again - and I'm going to type this very slowly: The supplied pcre which is
> *IN* CentOS *IS* built with UTF-8 support.
>
> And: Your problem has *nothing* to do with pcre, your problem lies *within*
> the iconv library.
>
> Ralph
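The two reports above differ in the Unicode properties line, which is the point under dispute. A minimal sketch of a scripted check, assuming `pcretest -C` prints one feature per line as in its usual layout. `pcre_features` is a hypothetical helper name and is not part of PCRE.

```shell
#!/bin/sh
# Hypothetical helper: classify `pcretest -C` output read from stdin.
# Prints one line of the form "utf8=yes|no ucp=yes|no".
pcre_features() {
    report=$(cat)
    utf8=no
    ucp=no
    # The patterns are anchored at line start (after optional indentation)
    # so that the negative lines "No UTF-8 support" and
    # "No Unicode properties support" do not match.
    if printf '%s\n' "$report" | grep -q '^[[:space:]]*UTF-8 support'; then
        utf8=yes
    fi
    if printf '%s\n' "$report" | grep -q '^[[:space:]]*Unicode properties support'; then
        ucp=yes
    fi
    printf 'utf8=%s ucp=%s\n' "$utf8" "$ucp"
}

# Usage on a machine that has pcretest installed:
#   pcretest -C | pcre_features
```

Reporting the two features separately keeps "UTF-8 support" and "Unicode properties support" from being conflated, which is where the thread went in circles.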
Re: [CentOS] UTF-8 support in PCRE
Amitava Shee wrote:

> Yes, building from source will work. I just want to know if there is a
> package (in some yum repository) somewhere so that updates, patches etc.
> gets applied with yum update. It would be nice to do something like
>
> yum install pcre-utf8

Again - and I'm going to type this very slowly: The supplied pcre which is *IN* CentOS *IS* built with UTF-8 support.

And: Your problem has *nothing* to do with pcre, your problem lies *within* the iconv library.

Ralph
Re: [CentOS] UTF-8 support in PCRE
On Tue, Jul 8, 2008 at 6:44 AM, Ralph Angenendt [EMAIL PROTECTED] wrote:

Okay kids, for those following along I'd like to take a moment to sum this thread up so far

No it isn't
Yes it is
No it isn't
Yes it is
No it isn't
Yes it is.

Thank you. This has been a brief email summary. You may not return to your regularly scheduled insanity.

--
During times of universal deceit, telling the truth becomes a revolutionary act.
George Orwell
Re: [CentOS] UTF-8 support in PCRE
On Tue, Jul 08, 2008 at 08:23:59AM -0400, Jim Perrin enlightened us:

> Okay kids, for those following along I'd like to take a moment to sum this
> thread up so far
>
> No it isn't
> Yes it is
> No it isn't
> Yes it is
> No it isn't
> Yes it is.
>
> Thank you. This has been a brief email summary. You may not return to your
> regularly scheduled insanity.

What should I do instead, if I can't return to insanity?

Matt

--
Matt Hyclak
Systems and Operations
Office of Information Technology
Ohio University
(740) 593-1222
Re: [CentOS] UTF-8 support in PCRE
Matt Hyclak wrote on Tue, 8 Jul 2008 08:59:51 -0400:

> What should I do instead, if I can't return to insanity?

go forward to it!

Kai

--
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
Re: [CentOS] UTF-8 support in PCRE
I tried to resist, but ...

On Tue, 2008-07-08 at 18:31 +0200, Kai Schaetzl wrote:

> Matt Hyclak wrote on Tue, 8 Jul 2008 08:59:51 -0400:
>
> > What should I do instead, if I can't return to insanity?

What convinces you that you ever left it? Insane folks don't know they're insane.

> go forward to it!
>
> Kai
Re: [CentOS] UTF-8 support in PCRE
On Tue, Jul 8, 2008 at 5:23 AM, Jim Perrin [EMAIL PROTECTED] wrote:

> Okay kids, for those following along I'd like to take a moment to sum this
> thread up so far
>
> No it isn't
> Yes it is
> No it isn't
> Yes it is
> No it isn't
> Yes it is.
>
> Thank you. This has been a brief email summary. You may not return to your
> regularly scheduled insanity.

We may NOT??? I happen to LIKE my regularly scheduled insanity - I need reality breaks from time to time.

Gee, Jim, you really are a big meanie ;^)

mhr
Re: [CentOS] UTF-8 support in PCRE
On Tue, Jul 8, 2008 at 9:44 AM, William L. Maltby [EMAIL PROTECTED] wrote:

> On Tue, 2008-07-08 at 18:31 +0200, Kai Schaetzl wrote:
>
> > Matt Hyclak wrote on Tue, 8 Jul 2008 08:59:51 -0400:
> >
> > > What should I do instead, if I can't return to insanity?
>
> What convinces you that you ever left it? Insane folks don't know they're
> insane.

Oh, yes, we do - that's the difference between us and sane folks. Sane folks don't know that they are sane.

CNR.

mhr
Re: [CentOS] UTF-8 support in PCRE
Please see my reply inline below

On Fri, Jul 4, 2008 at 5:29 AM, Ralph Angenendt [EMAIL PROTECTED] wrote:

> Amitava Shee wrote:
>
> > How do I get utf-8 support with PCRE? I am having problems building
> > lucene index using Zend_Lucene. I get the following error
> >
> > PHP Notice: iconv(): Detected an illegal character in input string in
> > /var/www/ZendFramework-1.5.2/library/Zend/Search/Lucene/Analysis/Analyzer/Common/Text.php
> > on line 56
>
> a) What does that have to do with pcre? (which can do UTF-8)

[Shee] Zend lucene search engine uses pcre and requires pcre to be compiled with --enable-utf8. Please see
http://framework.zend.com/manual/en/zend.search.lucene.charset.html#zend.search.lucene.charset.utf_analyzer

UTF-8 support can either be compiled into PCRE at build time or supported via shared library. But shared library support is included/excluded based on the distro. I believe, upstream RedHat does not include it. I was hoping to find a way in CentOS. I have no idea if other distro's support it. That's a research item for me.

> b) What is on line 56 in that file? Looks like iconv is choking on that.

[Shee] Framework code - don't know much there

> So try to process that file with iconv on the command line.
>
> Ralph
Re: [CentOS] UTF-8 support in PCRE
Amitava Shee wrote:

> On Fri, Jul 4, 2008 at 5:29 AM, Ralph Angenendt [EMAIL PROTECTED] wrote:
>
> > Amitava Shee wrote:
> >
> > > How do I get utf-8 support with PCRE?
> >
> > a) What does that have to do with pcre? (which can do UTF-8)
>
> [Shee] Zend lucene search engine uses pcre and requires pcre to be compiled
> with --enable-utf8. Please see
> http://framework.zend.com/manual/en/zend.search.lucene.charset.html#zend.search.lucene.charset.utf_analyzer
>
> UTF-8 support can either be compiled into PCRE at build time or supported
> via shared library. But shared library support is included/excluded based
> on the distro. I believe, upstream RedHat does not include it. I was hoping
> to find a way in CentOS. I have no idea if other distro's support it.
> That's a research item for me.

As I said: pcre can do UTF-8:

    %build
    %configure --enable-utf8

That's from the spec file.

And again: It's not pcre, it is iconv which doesn't like a character in one of the framework's files.

Ralph
Re: [CentOS] UTF-8 support in PCRE
Yes, building from source will work. I just want to know if there is a package (in some yum repository) somewhere so that updates, patches etc. gets applied with yum update. It would be nice to do something like

yum install pcre-utf8

-Amitava

On Mon, Jul 7, 2008 at 8:54 AM, Ralph Angenendt [EMAIL PROTECTED] wrote:

> Amitava Shee wrote:
>
> > On Fri, Jul 4, 2008 at 5:29 AM, Ralph Angenendt [EMAIL PROTECTED] wrote:
> >
> > > Amitava Shee wrote:
> > >
> > > > How do I get utf-8 support with PCRE?
> > >
> > > a) What does that have to do with pcre? (which can do UTF-8)
> >
> > [Shee] Zend lucene search engine uses pcre and requires pcre to be
> > compiled with --enable-utf8. Please see
> > http://framework.zend.com/manual/en/zend.search.lucene.charset.html#zend.search.lucene.charset.utf_analyzer
> >
> > UTF-8 support can either be compiled into PCRE at build time or supported
> > via shared library. But shared library support is included/excluded based
> > on the distro. I believe, upstream RedHat does not include it. I was
> > hoping to find a way in CentOS. I have no idea if other distro's support
> > it. That's a research item for me.
>
> As I said: pcre can do UTF-8:
>
>     %build
>     %configure --enable-utf8
>
> That's from the spec file.
>
> And again: It's not pcre, it is iconv which doesn't like a character in one
> of the framework's files.
>
> Ralph
Re: [CentOS] UTF-8 support in PCRE
On Mon, Jul 7, 2008 at 10:36 AM, Amitava Shee [EMAIL PROTECTED] wrote:

> Yes, building from source will work. I just want to know if there is a
> package (in some yum repository) somewhere so that updates, patches etc.
> gets applied with yum update. It would be nice to do something like
>
> yum install pcre-utf8

Okay, there's a disconnect, somewhere which you aren't getting. The pcre package included in centos does UTF8 just fine. The problem you are seeing is related to another package. You need to look at the script to see what iconv (where the problem actually is) is having problems with.

--
During times of universal deceit, telling the truth becomes a revolutionary act.
George Orwell
Re: [CentOS] UTF-8 support in PCRE
My error log with iconv is misleading. Please ignore that portion and instead use this little php script to check for utf-8 support in pcre

    <?php
    // \pL is a Unicode property class; the /u modifier switches the pattern to UTF-8 mode.
    if (@preg_match('/\pL/u', 'a') == 1) {
        echo "PCRE unicode support is turned on.\n";
    } else {
        echo "PCRE unicode support is turned off.\n";
    }
    ?>

Also, please check out this thread (lack of pcre utf8 support in RHEL).
http://marc.info/?l=php-i18n&m=118303425505336&w=2

-Amitava

On Mon, Jul 7, 2008 at 10:45 AM, Jim Perrin [EMAIL PROTECTED] wrote:

> On Mon, Jul 7, 2008 at 10:36 AM, Amitava Shee [EMAIL PROTECTED] wrote:
>
> > Yes, building from source will work. I just want to know if there is a
> > package (in some yum repository) somewhere so that updates, patches etc.
> > gets applied with yum update. It would be nice to do something like
> >
> > yum install pcre-utf8
>
> Okay, there's a disconnect, somewhere which you aren't getting. The pcre
> package included in centos does UTF8 just fine. The problem you are seeing
> is related to another package. You need to look at the script to see what
> iconv (where the problem actually is) is having problems with.
>
> --
> During times of universal deceit, telling the truth becomes a
> revolutionary act.
> George Orwell
Re: [CentOS] UTF-8 support in PCRE
Please stop top posting.

Thank you.

mhr
Re: [CentOS] UTF-8 support in PCRE
sure thing <g>

MHR wrote:

> Please stop top posting.
>
> Thank you.
>
> mhr
Re: [CentOS] UTF-8 support in PCRE
Amitava Shee wrote:

> How do I get utf-8 support with PCRE? I am having problems building lucene
> index using Zend_Lucene. I get the following error
>
> PHP Notice: iconv(): Detected an illegal character in input string in
> /var/www/ZendFramework-1.5.2/library/Zend/Search/Lucene/Analysis/Analyzer/Common/Text.php
> on line 56

a) What does that have to do with pcre? (which can do UTF-8)

b) What is on line 56 in that file? Looks like iconv is choking on that.

So try to process that file with iconv on the command line.

Ralph
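A minimal sketch of the command-line check Ralph suggests, assuming the input is meant to be UTF-8 and that the `iconv` utility is installed. `is_valid_utf8` is a hypothetical helper name. Converting UTF-8 to UTF-8 changes nothing, but iconv exits non-zero when it meets a byte sequence that is not legal UTF-8.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named file is valid UTF-8
# according to iconv. The converted output is discarded; only the exit
# status matters.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Usage (the file name is a placeholder for whatever is being indexed):
#   is_valid_utf8 some-document.txt || echo "illegal byte sequence found"
```

Running iconv by hand without the redirection also prints the position of the first offending byte, which helps locate the character in the input.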