Re: [Wikitech-l] leading space and tag

2019-07-22 Thread Subramanya Sastry

On 7/22/19 11:05 AM, Subramanya Sastry wrote:

> On 7/22/19 10:51 AM, Arlo Breault wrote:
>
>> On Jul 22, 2019, at 5:11 AM, Sergey F wrote:
>>
>>> test2
>>>   test3
>>>
>>> The result of conversion is:
>>>
>>> test2
>>> test3
>>
>> Yes, this looks like a bug
>>
>> See https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/524811
>>
>> Thanks
>
> Thanks Arlo!
>
> Sergey:
>
> It is possible that Arlo's bugfix will satisfy your use case.


It would have helped if I had actually seen Arlo's patch before I sent 
that email - he was fixing a case where we were not adding a nowiki 
where it should have been added.


So, you will need to pass the scrub_wikitext parameter if you want to 
avoid the nowikis. Or, you can normalize the HTML yourself before 
passing it to Parsoid.
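
For the second option, here is a minimal Python sketch of normalizing the HTML yourself. It assumes the only problem is spaces or tabs at the start of a line, and that whitespace inside <pre> blocks must be left alone. The function name strip_leading_spaces is hypothetical, and this is not what Parsoid's own normalization does internally; it only removes the whitespace that would otherwise turn into wikitext markup.

```python
import re

# Matches whole <pre>...</pre> blocks so their whitespace stays untouched.
_PRE_BLOCK = re.compile(r"(<pre\b.*?</pre>)", re.IGNORECASE | re.DOTALL)
# A newline followed by spaces or tabs: the pattern that becomes markup in wikitext.
_LEADING_WS = re.compile(r"\n[ \t]+")


def strip_leading_spaces(html: str) -> str:
    """Remove spaces and tabs that follow a newline, except inside <pre> blocks."""
    parts = _PRE_BLOCK.split(html)
    # With one capturing group, odd indices hold the <pre> blocks themselves.
    for i in range(0, len(parts), 2):
        parts[i] = _LEADING_WS.sub("\n", parts[i])
    return "".join(parts)


if __name__ == "__main__":
    print(strip_leading_spaces("<p>test2\n  test3</p>"))
```

This is a regex-level cleanup, not an HTML parser: it does not touch whitespace at the very start of the input, and it would need more care for other whitespace-sensitive content such as <textarea>.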


Or, if you were just reporting the inconsistency, ignore my emails. :-)

Subbu.


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] leading space and tag

2019-07-22 Thread Subramanya Sastry

On 7/22/19 10:51 AM, Arlo Breault wrote:

> On Jul 22, 2019, at 5:11 AM, Sergey F wrote:
>
>> test2
>>   test3
>>
>> The result of conversion is:
>>
>> test2
>> test3
>
> Yes, this looks like a bug
>
> See https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/524811
>
> Thanks

Thanks Arlo!

Sergey:

It is possible that Arlo's bugfix will satisfy your use case.

However, note that Parsoid will introduce <nowiki> protection around 
characters that would parse differently if left unescaped. So " foo" 
will convert to "<nowiki> </nowiki>foo". You can avoid this by passing 
the 'scrub_wikitext' flag to the html -> wikitext API endpoint [1]. This 
tells Parsoid to normalize [2] the input HTML to eliminate the need for 
those nowikis.
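
As a rough illustration, the request could be built like this in Python. This is a sketch under assumptions: the base URL (a local Parsoid on port 8000), the domain, and the helper name build_html2wt_request are all hypothetical, and the path follows the v3 transform route as I understand it from the API page [1], so please check it against your own installation. The function only builds the request; it does not send it.

```python
import json
from urllib import request


def build_html2wt_request(base_url: str, domain: str, html: str,
                          scrub: bool = True) -> request.Request:
    """Build a POST request asking Parsoid to convert HTML to wikitext."""
    # Assumed route; verify against the API documentation for your install.
    url = f"{base_url}/{domain}/v3/transform/html/to/wikitext"
    # 'scrub_wikitext' asks Parsoid to normalize the HTML before serializing.
    payload = {"html": html, "scrub_wikitext": scrub}
    return request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_html2wt_request(
        "http://localhost:8000",   # assumption: local Parsoid service
        "en.wikipedia.org",        # assumption: configured wiki domain
        "<p>test2\n test3</p>",
    )
    print(req.get_method(), req.full_url)
    # To actually send it: request.urlopen(req).read().decode("utf-8")
```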


FYI in case this flag is pertinent to your use case.

Subbu.

1. 
https://www.mediawiki.org/wiki/Parsoid/API#For_HTML_-%3E_wikitext_requests


2. https://www.mediawiki.org/wiki/Parsoid/Normalizations



Re: [Wikitech-l] leading space and tag

2019-07-22 Thread Arlo Breault


> On Jul 22, 2019, at 5:11 AM, Sergey F  wrote:
> 
> test2
>  test3
> 
> 
> The result of conversion is:
> 
> test2
> test3
> 

Yes, this looks like a bug

See https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/524811

Thanks



[Wikitech-l] leading space and tag

2019-07-22 Thread Sergey F

Hello,

I use Parsoid to publish email messages to a wiki and have a small 
issue.
Sometimes the generated article has "preformatted" fragments that do not 
have any special formatting in the source text.
After investigating, I discovered that this is caused by spaces at the 
start of a new line in the HTML text.
When the source HTML of the email is viewed in a browser, these spaces 
have no effect, but after conversion to wikitext they become part of the 
markup.
Next, while trying to understand the way Parsoid works, I saw that these 
spaces are normally surrounded with a <nowiki> tag, but in some 
circumstances that does not happen.


So I made a test HTML file to see the different conversion results:






test2
 test3


test2
 test3


textxtest2
 test3





The result of conversion is:

test2
 test3


test2
 test3


textxtest2
 test3


It seems that if the new line comes right at the end of a tag, <nowiki> 
is not inserted.

