This is not a bug in the current RSS connector, but it is probably something important...

The current RSS connector uses XMLFileContext for temporary XML(?), and
problems may happen here if <description> and <content> contain sub-elements...
but in our specific use case the content is an HTML snippet, which we don't
treat as XML, so unescaped characters are natural...

So I think there are no problems (with the current RSS specs), but we might
have problems in the future with other use cases such as:
<description>
        <sub-description-1>     &lt;H1&gt;Header </sub-description-1>
</description>


The output to the temp file will be malformed XML, because the escaped
characters are written back unescaped:
        <sub-description-1>     <H1>Header </sub-description-1>
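
To make the fix concrete, here is a minimal sketch of the kind of escaping
tagContents could apply before writing. XmlEscapeSketch and its methods are
hypothetical names of mine, not part of ManifoldCF, and it only handles the
three characters that matter for element content (<, > and &):

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

/** Hypothetical helper for illustration only; not part of ManifoldCF. */
final class XmlEscapeSketch {
  private XmlEscapeSketch() {}

  /** Writes ch[start .. start+length) to out, escaping XML-special characters. */
  static void writeEscaped(Writer out, char[] ch, int start, int length)
      throws IOException {
    int end = start + length;
    for (int i = start; i < end; i++) {
      char c = ch[i];
      switch (c) {
        case '<': out.write("&lt;"); break;
        case '>': out.write("&gt;"); break;
        case '&': out.write("&amp;"); break;
        default:  out.write(c);
      }
    }
  }

  /** Convenience wrapper that returns the escaped text as a String. */
  static String escape(String s) throws IOException {
    StringWriter w = new StringWriter();
    char[] ch = s.toCharArray();
    writeEscaped(w, ch, 0, ch.length);
    return w.toString();
  }
}
```

Assuming a subclass can reach the writer, an overridden tagContents could call
writeEscaped(theWriter, ch, start, length) in place of the plain
theWriter.write(ch,start,length), inside the same try/catch. With that, the
content from the example above would go to the temp file as
&lt;H1&gt;Header again.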






-----Original Message-----
From: Karl Wright [mailto:[email protected]] 
Sent: March-24-11 10:26 PM
To: [email protected]
Subject: Re: XMLWriterContext: tagContext doesn't escape chars

Ok, although I am curious whether this is a bug with a current connector?
Or is this something new you were trying to do?

Karl

On Thu, Mar 24, 2011 at 10:21 PM, Fuad Efendi <[email protected]> wrote:
> Hi Karl, I think the initial message was improperly (re)formatted... I
> suspect connector-user allows HTML, and connector-dev allows only plain
> text.
>
> The class XMLWriterContext, method tagContents(char[] ch, int start,
> int length) should escape special characters before writing to Writer...
> beginTag and endTag already do that; obviously this class is needed to 
> output XML.
> Fortunately it is easy to extend this class in "connector" plugin and 
> override this method.
>
>
>  /** This method is meant to be extended by classes that extend this class */
>  protected void tagContents(char[] ch, int start, int length)
>    throws ManifoldCFException
>  {
>    try
>    {
>      theWriter.write(ch,start,length);
>    }
>    catch (java.net.SocketTimeoutException e) ... ... ...
>
>
> -Fuad
>
>
>
>
>
>
> -----Original Message-----
> From: Karl Wright [mailto:[email protected]]
> Sent: March-24-11 10:10 PM
> To: [email protected]
> Subject: Re: XMLWriterContext: tagContext doesn't escape chars
>
> Could you resend your previous message?  I don't think it made it 
> through; perhaps you were not signed up for the list at that point.
> This is the first message of this thread that was posted.
>
> Thanks,
> Karl
>
> On Thu, Mar 24, 2011 at 7:22 PM, Fuad Efendi <[email protected]> wrote:
>> I just found it.
>>
>>  /** This method is meant to be extended by classes that extend this class */
>>  protected void tagContents(char[] ch, int start, int length)
>>    throws ManifoldCFException
>>  {
>>    try
>>    {
>>      theWriter.write(ch,start,length);
>>    }
>>    catch (java.net.SocketTimeoutException e)
>> ...
>>
>> And we are using temp files with the RSS connector.
>>
>> I tried to split a big feed into "entities", stored as XML documents,
>> but I found that some XML-escaped characters will be unescaped (for
>> instance, RSS may contain an HTML snippet as the value of an element)
>>
>> -Fuad
>>
>
>
