Re: Character encoding problem in FilterParserImpl (AD version < 1.5)

Norval Hope Thu, 07 Aug 2008 04:55:50 -0700

Hi Emanuel,

Thanks for the feedback; I knew I was trying my luck a bit but thought
I'd better check if anyone happened to remember any relevant history.
I'm getting reasonably hopeful that I can backport the new
FilterParser with a bit of creative reactive refactoring ...


Once I get past this current firefight I think I'll finally get onto the
job of some serious resyncing with a more current version of AD. I
know there's lots for me to get my head around, not just coming to
terms with the new stuff but also working out how to apply my
customisations / work out if they're still necessary etc ...

Cheers

On Wed, Aug 6, 2008 at 6:48 PM, Emmanuel Lécharny <[EMAIL PROTECTED]> wrote:
> Hi Norval,
>
> the thing is that the antlr based filter parser has been totally rewritten
> in 1.5, and replaced by a hand crafted parser, so it's difficult to say if
> the previous version correctly handles UTF-8 chars in any case, but a blind
> guess is that it may be buggy.
>
> Now, to be frank, I don't think we will spend some time fixing 1.0 version
> of the server, as we are dedicated to get 2.0 out as soon as possible. 1.0
> is almost a dead branch... That does not mean committers can't fix it, if
> needed ! We can even release it, but I would say : it's up to you !
>
> Don't get me wrong : I'm not saying that you are on your own, and we don't
> want to help you, it's just that, eh, we don't have time for 1.0 anymore, as
> it's already really hard to find time to fix urgent bugs in 1.5 ! Hopefully,
> as soon as we are done with some big refactoring we have been doing for
> months in a branch, we will be able to get back to work on trunk and fix
> those urgent bugs...
>
> Thanks !
>
> Norval Hope wrote:
>>
>> Hi All,
>>
>> I'm using a customized version of ApacheDS 1.5 and have run into a
>> problem with the filter parser. I know this has been much improved in
>> more recent versions of AD but I'm not able to upgrade at this moment
>> (however, it seems I'll be able to start embarking on the process of
>> resyncing with a more recent build in the next two or three weeks). I
>> got excited when I saw a naked getBytes() call in the code from
>> FilterParserImpl below:
>>
>>
>>    public synchronized ExprNode parse( String filter )
>>        throws ParseException, IOException
>>    {
>>        ExprNode root = null;
>>
>>        if ( filter == null || filter.trim().equals( "" ) )
>>        {
>>            return null;
>>        }
>>
>>        if ( filter.indexOf( "**" ) > -1 )
>>        {
>>            filter = StringTools.trimConsecutiveToOne( filter, '*' );
>>        }
>>
>>        this.parserPipe.write( filter.getBytes("UTF-8") );   // *******************
>>        this.parserPipe.write( '\n' );
>>        this.parserPipe.flush();
>>
>> and added the "UTF-8" thinking that would sort out my problem. This
>> improved (or at least changed) the situation as the multi-byte Chinese
>> character I passed in to the filter expression no longer came out as a
>> '?' but rather as a different, but incorrect, character.
>>
>> Given I don't know anything about the ANTLR generated code sitting on
>> the other end of the pipe I was hoping someone more knowledgeable
>> might be able to cast their minds back and offer some clues about:
>>  a) whether my suspicion that the ANTLR code is expecting UTF-8 is
>> accurate
>>  b) whether there is any way I might be able to tweak the ANTLR code
>> or Maven build process so that multi-byte characters appear correctly
>> in the parse tree.
>>
>> Many thanks
>>
>>
>
>
> --
> --
> cordialement, regards,
> Emmanuel Lécharny
> www.nextury.com
> directory.apache.org
>
>
>
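
The symptom described in the quoted message (a '?' turning into a different
but still wrong character once "UTF-8" is passed to getBytes()) is consistent
with the writing and reading ends of the pipe disagreeing on the charset. The
sketch below is only an illustration of that general effect using the JDK; it
does not use or reproduce the ANTLR generated code, and the class name
EncodingDemo, the method roundTrip and the choice of ISO-8859-1 as the
mismatched charset are assumptions made for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Hypothetical helper: encode text with one charset and decode the bytes
    // with another, mimicking a writer and a reader on opposite ends of a pipe.
    public static String roundTrip(String text, Charset writerCharset, Charset readerCharset) {
        byte[] bytes = text.getBytes(writerCharset);
        return new String(bytes, readerCharset);
    }

    public static void main(String[] args) {
        // A single Chinese character (U+4E2D), written as an escape so the
        // example does not depend on the source file encoding.
        String han = "\u4e2d";

        // Writer charset cannot represent the character: it is replaced by '?'.
        System.out.println(roundTrip(han, StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1));

        // Writer uses UTF-8 but the reader treats every byte as one character:
        // the three UTF-8 bytes come back as three unrelated characters.
        String garbled = roundTrip(han, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length());

        // Both ends agree on UTF-8: the original character survives.
        System.out.println(roundTrip(han, StandardCharsets.UTF_8, StandardCharsets.UTF_8).equals(han));
    }
}
```

If the reading side of the pipe behaves like the second case, fixing only the
getBytes() call would not be enough; the reader would also need to decode the
bytes as UTF-8. Whether that is what the ANTLR based parser does is not
established by this thread.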
