Re: [tw5] Re: Tiddlywiki and regexp

2019-09-17 Thread @TiddlyTweeter
Right.

It's clean when you have consecutive items.

I'm trying to work out what to do when you don't.

TT

On Tuesday, 17 September 2019 13:56:22 UTC+2, TonyM wrote:
>
> Yt?
>
> In this case I was extracting all list items from a more complex html 
> source then relisting the items.
> The result is clean with only list items.
>
> Regards
> Tony
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"TiddlyWiki" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to tiddlywiki+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tiddlywiki/721b9616-6ac5-46ef-bc9c-1115e8c46e56%40googlegroups.com.


Re: [tw5] Re: Tiddlywiki and regexp

2019-09-17 Thread TonyM
Yt?

In this case I was extracting all list items from a more complex html source 
then relisting the items.
The result is clean with only list items.

Regards
Tony



Re: [tw5] Re: Tiddlywiki and regexp

2019-09-17 Thread @TiddlyTweeter
Ciao Mark

I'm late on this. I got really interested in this kind of extraction, which 
I think there is demand for.

Two issues I can't figure out ...

1 - Does "<$vars realchars="[^\s]+">" need to be that, rather than the 
equivalent shorthand "<$vars realchars="\S+">"? (In which case you would not 
need the variable, as there would be no "[...]".)

2 - WHEN you have text BETWEEN tags, is there a way to dump it?

Only if you have time and interest!

Best wishes
TT 


Mark S. wrote:
>
> Actually, the tool we have for regexp is also a bit lacking. There's no 
> tool for directly lifting desired target text. The new splitregexp only 
> splits, it doesn't 
> return the text we want to find. Here's my version that does most 
> literally what you ask for
>
> <$vars realchars="[^\s]+">
> <$list filter="[{test}splitregexp[\n]join[ ]splitregexp[
> ]butfirst[1]splitregexp[]butlast[1]regexp]">
>
> 
> 
>
> Input:
>
> More text here
> line 3
> line 2
> line 1
> More text there
>
> Output
>
>
> line 3 
> line 2 
> line 1 
>
>
>
> Good luck!
>
> On Thursday, August 22, 2019 at 2:21:34 AM UTC-7, TonyM wrote:
>>
>> Jeremy,
>>
>> You are aware I do not want so much to parse it as locate the content 
>> between matching tags.
>>
>> Its intention is to access content delimited by html tags inside the text 
>> content.
>>
>> Perhaps we could use it to retrieve items between the section div tags or 
>> all instances of text between the li tags.
>>
>> Regards
>> Tony
>>
>>



Re: [tw5] Re: Tiddlywiki and regexp

2019-09-17 Thread @TiddlyTweeter
TonyM

It makes great sense to throw away unneeded text BETWEEN tags.

Unfortunately I could not get your version to work.

As far as I can see it just re-adds tags you just took off, and also adds 
them to text you need to excise.

Yes?

TT

On Sunday, 25 August 2019 05:50:46 UTC+2, TonyM wrote:
>
> Mark,
>
> Thanks for this. I only just got to test it; a test tiddler as follows 
> is not working as I expected:
> zfdtshwfthf
> Content
> sfghn
> Content2
>
> sfghsfgh
> Content3
> sxgfhfgsdh
>
> I would have hoped it would return
> Content
> Content2
> Content3
>
>
> If it was to return only the content between the ` and ` and not 
> any other content from the test tiddler I could do this;
> \define output()
> <$vars realchars="[^\s]+">
> <$list 
> filter="[{test data}splitregexp[\n]join[ 
> ]splitregexp[]butfirst[1]splitregexp[]butlast[1]regexpaddprefix[]addsuffix[]]"
> >
>
> 
> 
> \end
> <$wikify name=result text="<>">
> <>
> 
> Which would find all list items in test (HTML copied from somewhere) and 
> create a new list of only list (li) items in the HTML
>
> Does that make sense?
>
> Regards
> Tony
>
> On Friday, August 23, 2019 at 2:08:20 AM UTC+10, Mark S. wrote:
>>
>> Re your 2nd question, you can make the filter slightly more robust:
>>
>> [{test}splitregexp[\n]join[ ]splitregexp[]butfirst[1]splitregexp
>> []butlast[1]regexp]
>>
>> Re your 1st question, I don't believe you can do this in a single filter. 
>> It will probably take multiple lines if possible at all. Because, there are 
>> no core tools
>> for grabbing the actual text you want -- only for splitting. People have 
>> done a lot with splitting, but it gets tedious.
>>
>> If you had a regular expression filter that could split and return groups 
>> (e.g. #2963) then you could simply search for and lift out the  
>> group and the content group in one regular expression.
>>
>> On Thursday, August 22, 2019 at 7:58:06 AM UTC-7, TonyM wrote:
>>>
>>> Mark - Wow,
>>>
>>> I will test it out tomorrow to see how far I can take it. 
>>>
>>> I hope it works for multi-line tags
>>>
>>> My interest would be also the option to return
>>> line 3
>>> line 2
>>> line 1
>>> or
>>> line 3
>>> line 2 
>>> line 1 
>>> Because keeping the valid tags can be made use of as well.
>>>
>>> And also see how to handle If the list tag had a style eg >> style="something"> it would be nice if we could return
>>> line 1
>>> or
>>> line 1
>>>
>>> If so a lot can be done to extract useful content from html, even if 
>>> just to summarise some content.
>>>
>>> Perhaps further resolution would help like >> name=extract>content
>>>
>>> Or extract list items.
>>>
>>> Even without using html a tiddlers text field could use html block and 
>>> inline elements https://www.w3schools.com/html/html_blocks.asp to 
>>> structure the content, and with such a regex macro extract parts of the 
>>> tiddler text such as say a prepared extract from the content, or an 
>>> excerpt, or a config settings or more.
>>>
>>> Regards
>>> Tony
>>>
>>>
>>> On Friday, August 23, 2019 at 12:22:47 AM UTC+10, Mark S. wrote:


 There's that saying, "When all you have is a hammer, everything starts 
 to look like a nail."

 All we have is regex. It would be great to have some other tool for 
 extracting actual DOM-like structures the way you
 could with TW classic. But we don't have it.

 Actually, the tool we have for regexp is also a bit lacking. There's no 
 tool for directly lifting desired target text. The new splitregexp only 
 splits, it doesn't 
 return the text we want to find. Here's my version that does most 
 literally what you ask for

 <$vars realchars="[^\s]+">
 <$list filter="[{test}splitregexp[\n]join[ ]splitregexp[
 ]butfirst[1]splitregexp[]butlast[1]regexp]">

 
 

 Input:

 More text here
 line 3
 line 2
 line 1
 More text there

 Output


 line 3 
 line 2 
 line 1 



 Good luck!

 On Thursday, August 22, 2019 at 2:21:34 AM UTC-7, TonyM wrote:
>
> Jeremy,
>
> You are aware I do not want so much to parse it as locate the content 
> between matching tags.
>
> Its intention is to access content delimited by html tags inside the 
> text content.
>
> Perhaps we could use it to retrieve items between the section div tags 
> or all instances of text between the li tags.
>
> Regards
> Tony
>
>


Re: [tw5] Re: Tiddlywiki and regexp examples part ii: Working with fields

2019-09-06 Thread @TiddlyTweeter
Regex is very powerful and often confusing :-)

For instance ...

*^(([^/]*?)/){1,}[^/]*?$*

Is actually, functionally, the same as the very simple ...

*/*

Test these at: 
http://tw-regexp.tiddlyspot.com/#RegExp%20Experimentation%20with%20Title

The point is whether to use a more complex regex, which can be made more 
precise by changing *{1,}*, or simply to match the immediate need: that a 
tiddler title contains a "/" slash.

It's a pragmatic tool.
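
The equivalence can be checked outside TW with a minimal sketch in plain JavaScript (not TW filter syntax). The sample titles are made up for illustration, and both patterns are built from strings, as they would be when passed in from a filter.

```javascript
// Hypothetical titles, chosen only for illustration.
const titles = ["plain", "a/b", "a/b/c", "trailing/", "///", "$:/core/ui"];

// The long pattern from the post, and the short one it is compared with.
const complexPattern = new RegExp("^(([^/]*?)/){1,}[^/]*?$");
const simplePattern = new RegExp("/");

// Keep only the titles each pattern matches.
const complexMatches = titles.filter((t) => complexPattern.test(t));
const simpleMatches = titles.filter((t) => simplePattern.test(t));

console.log(complexMatches);
console.log(simpleMatches);
```

Both filters keep the same titles here: every title containing at least one "/". The long form only starts to pay off once the {1,} is changed to a specific count.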

TT


On Friday, 6 September 2019 11:38:24 UTC+2, @TiddlyTweeter wrote:
>
> Mohammad wrote:
>>
>> Does this pattern allow trailing slashes?
>>
>
> No. To do that you could use ...
>
> *^(([^/]*?)/){1,}[^/]*?$*
> This makes the negation classes "[^/]*?" matching "not /" of 0 or more 
> length (rather than 1 or more)
> This will match tiddlers ending "/", as well as cases where the title 
> could just be "///"
>
> By the way, if you want to see all titles with "/" use {1,} = 1 or more
>
> ---
>
> But to match that use case only, where you only wanted to list tiddlers 
> with a trailing "/"  use this simple pattern :-) ...
>
> */$*
>
> Part of the art with regex is determining when to be minimal and when to 
> go for something with wider matching power but more complexity.
> The more complex it gets the more important it gets to test against data 
> to be sure it works as expected.
>
> TT
>  
>
>>
>>
>> *Advanced use of the Negated Character Class*
>>>
>>> *Match titles with defined numbers of "/" slash*
>>> *^(([^/]+?)/){1}[^/]+?$*
>>>
>>> The  difference here is in *{1}*
>>>
>>> If you change the number then it will change the number of "/" permitted 
>>> in the match.
>>>
>>> You can test it at: 
>>> http://tw-regexp.tiddlyspot.com/#RegExp%20Experimentation%20with%20Title
>>>  
>>>
>>



Re: [tw5] Re: Tiddlywiki and regexp examples part ii: Working with fields

2019-09-06 Thread @TiddlyTweeter
Mohammad wrote:
>
>  At https://regex101.com/ the syntax below returns errors; it needs the 
> slash character to be escaped!
>

Mark's detailed answer to why that happens (and why it is *not *an issue in 
TW) is really clear.

I think it may be worth adding a note about the way that TW relates to the 
underlying JS regex engine. 
Tools like TW that access the engine usually use an interface that is 
defined by the tool's programmers.

It's part of the same issue of understanding how the "scope" flags "g" and 
"m" are invoked in TW. 
This can be important when matching in the text field, which TW can do, but 
which needs a bit more documentation to be used optimally.
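
As background, a minimal sketch of what the "g" and "m" flags do in plain JavaScript. How TW itself passes these flags is not shown in this thread, so this only illustrates the underlying engine; the sample text is hypothetical.

```javascript
// Hypothetical multi-line text, standing in for a tiddler's text field.
const fieldText = "first line\nsecond line\nthird line";

// Without "m", ^ anchors only at the very start of the whole string.
const withoutM = new RegExp("^second").test(fieldText);

// With "m", ^ also anchors immediately after each line break.
const withM = new RegExp("^second", "m").test(fieldText);

// With "g", match() returns every match rather than only the first one.
const lineStarts = fieldText.match(new RegExp("^\\w+", "gm"));

console.log(withoutM, withM, lineStarts);
```

For a single-line field such as a title the "m" flag makes no difference, which is why it mostly matters for the text field.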

TT 

On Friday, 6 September 2019 06:29:13 UTC+2, Mark S. wrote:
>
> There's a difference in javascript between a regular expression, and a 
> string that can be interpreted as a regular expression.
>
> If you notice at regex101, the input box has / at the start and end. So 
> it's assuming a direct regular expression, like:
>
> /^(([^/]+?)/){1}[^/]+?$/gm
>
> The slashes are used to indicate the start and end of the expression, and 
> so any slashes in the middle not part of a character class throw an error.
>
> But we're actually passing a string here. So internally something like 
> this is happening:
>
> var patt = new RegExp("^(([^/]+?)/){1}[^/]+?$");
>
> Since the forward slash is not needed to delimit the expression when the 
> regular expression is created this way, it doesn't throw an error inside of 
> TW.
> It's unfortunate that the tool at regex101 doesn't allow you to enter the 
> expression as a string.
>
> -- Mark
>
>>
>> *Advanced use of the Negated Character Class*
>>>
>>> *Match titles with defined numbers of "/" slash*
>>> *^(([^/]+?)/){1}[^/]+?$*
>>>
>>



Re: [tw5] Re: Tiddlywiki and regexp examples part ii: Working with fields

2019-09-06 Thread @TiddlyTweeter
Mohammad wrote:
>
> Does this pattern allow trailing slashes?
>

No. To do that you could use ...

*^(([^/]*?)/){1,}[^/]*?$*
This makes the negated classes "[^/]*?" match runs of "not /" that are 0 or 
more characters long (rather than 1 or more).
This will match tiddlers ending in "/", as well as cases where the title is 
just "///".

By the way, if you want to see all titles with "/" use {1,} = 1 or more

---

But to match that use case only, where you only wanted to list tiddlers 
with a trailing "/"  use this simple pattern :-) ...

*/$*

Part of the art with regex is determining when to be minimal and when to go 
for something with wider matching power but more complexity.
The more complex it gets the more important it gets to test against data to 
be sure it works as expected.
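
The three patterns can be compared side by side with a minimal sketch in plain JavaScript. The titles are hypothetical and the variable names exist only for this illustration.

```javascript
// Hypothetical titles, chosen only for illustration.
const sampleTitles = ["a/b", "a/b/", "a/b/c", "///", "noslash"];

// Exactly one "/", with non-empty text on both sides of it.
const strictOne = new RegExp("^(([^/]+?)/){1}[^/]+?$");
// One or more "/", where the text around each slash may be empty.
const lenientMany = new RegExp("^(([^/]*?)/){1,}[^/]*?$");
// Only requires that the title ends with "/".
const trailingOnly = new RegExp("/$");

const strictHits = sampleTitles.filter((t) => strictOne.test(t));
const lenientHits = sampleTitles.filter((t) => lenientMany.test(t));
const trailingHits = sampleTitles.filter((t) => trailingOnly.test(t));

console.log(strictHits, lenientHits, trailingHits);
```

With these titles the strict pattern keeps only "a/b", the lenient one keeps everything except "noslash", and "/$" keeps just "a/b/" and "///".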

TT
 

>
>
> *Advanced use of the Negated Character Class*
>>
>> *Match titles with defined numbers of "/" slash*
>> *^(([^/]+?)/){1}[^/]+?$*
>>
>> The  difference here is in *{1}*
>>
>> If you change the number then it will change the number of "/" permitted 
>> in the match.
>>
>> You can test it at: 
>> http://tw-regexp.tiddlyspot.com/#RegExp%20Experimentation%20with%20Title 
>>
>



Re: [tw5] Re: Tiddlywiki and regexp examples part ii: Working with fields

2019-09-06 Thread @TiddlyTweeter
Ciao Mohammad

As it is, it has little utility. I included it for learning purposes: a very 
simple case that illustrates what negated classes do.

I don't know how much regex users have, so I think for docs it's useful to 
give some really simple examples and then build from them.
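
A minimal sketch of that simple case in plain JavaScript, assuming a made-up list of titles. It is not TW filter syntax, only the same pattern run directly against the regex engine.

```javascript
// Hypothetical titles: two system tiddlers, two ordinary ones, one empty string.
const allTitles = ["$:/core/ui/PageTemplate", "HelloThere", "$:/config", "My $ notes", ""];

// ^ anchors at the start; [^\$] matches one character that is not "$".
// The backslash is doubled because the pattern is written as a JS string.
const notDollarFirst = new RegExp("^[^\\$]");

const ordinary = allTitles.filter((t) => notDollarFirst.test(t));

console.log(ordinary);
```

Note that the class must consume one character, so the empty string does not match, and a "$" later in the title ("My $ notes") does not prevent a match.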

Best wishes
TT

Mohammad wrote:
>
>  How can we use this in a real case? Not starting with a $ sign means all 
> ordinary tiddlers!
>
>>
>> Match titles NOT starting "$"
>>
>> *^[^\$]*
>>
>>
>> *"^"* = start of scope, in this case the start of the title field
>> *"[^"* = inside a character class, in first position, *"^"* means "match 
>> the negation" of the following character(s)
>> *"\$"* = match the character "$" literally
>> *"]" *= close character class
>>
>



Re: [tw5] Re: Tiddlywiki and regexp examples part ii: Working with fields

2019-09-05 Thread Mohammad Rahmani
Hi Mark,

Thanks for the clarification!


Best wishes
Mohammad


On Fri, Sep 6, 2019 at 8:59 AM 'Mark S.' via TiddlyWiki <
tiddlywiki@googlegroups.com> wrote:

> There's a difference in javascript between a regular expression, and a
> string that can be interpreted as a regular expression.
>
> If you notice at regex101, the input box has / at the start and end. So
> it's assuming a direct regular expression, like:
>
> /^(([^/]+?)/){1}[^/]+?$/gm
>
> The slashes are used to indicate the start and end of the expression, and
> so any slashes in the middle not part of a character class throw an error.
>
> But we're actually passing a string here. So internally something like
> this is happening:
>
> var patt = new RegExp("^(([^/]+?)/){1}[^/]+?$");
>
> Since the forward slash is not needed to delimit the expression when the
> regular expression is created this way, it doesn't throw an error inside of
> TW.
> It's unfortunate that the tool at regex101 doesn't allow you to enter the
> expression as a string.
>
> -- Mark
>
>
>
> On Thursday, September 5, 2019 at 8:48:08 PM UTC-7, Mohammad wrote:
>>
>> TT,
>>  At https://regex101.com/ the syntax below returns errors; it needs the
>> slash character to be escaped!
>>
>> Please have a look
>>
>>
>> Best wishes
>> Mohammad
>>
>>
>> On Thu, Sep 5, 2019 at 4:05 PM @TiddlyTweeter 
>> wrote:
>>
>>> *Advanced use of the Negated Character Class*
>>>
>>> *Match titles with defined numbers of "/" slash*
>>> *^(([^/]+?)/){1}[^/]+?$*
>>>
>>> The  difference here is in *{1}*
>>>
>>> If you change the number then it will change the number of "/" permitted
>>> in the match.
>>>
>>> You can test it at:
>>> http://tw-regexp.tiddlyspot.com/#RegExp%20Experimentation%20with%20Title
>>>
>>>
>>> TT
>>>


Re: [tw5] Re: Tiddlywiki and regexp examples part ii: Working with fields

2019-09-05 Thread 'Mark S.' via TiddlyWiki
There's a difference in javascript between a regular expression, and a 
string that can be interpreted as a regular expression.

If you notice at regex101, the input box has / at the start and end. So 
it's assuming a direct regular expression, like:

/^(([^/]+?)/){1}[^/]+?$/gm

The slashes are used to indicate the start and end of the expression, and 
so any slashes in the middle not part of a character class throw an error.

But we're actually passing a string here. So internally something like this 
is happening:

var patt = new RegExp("^(([^/]+?)/){1}[^/]+?$");

Since the forward slash is not needed to delimit the expression when the 
regular expression is created this way, it doesn't throw an error inside of 
TW.
It's unfortunate that the tool at regex101 doesn't allow you to enter the 
expression as a string.
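
The difference between the two forms can be sketched in plain JavaScript as follows. The test strings are hypothetical; the literal form escapes the slash that sits outside a character class, while the string form needs no escaping for "/".

```javascript
// Regex literal: "/" delimits the pattern, so the slash outside the
// character classes is written as \/ here.
const fromLiteral = /^(([^/]+?)\/){1}[^/]+?$/;

// String form: there are no delimiters, so "/" can be written as-is.
const fromString = new RegExp("^(([^/]+?)/){1}[^/]+?$");

// Both objects behave the same on these made-up titles.
const literalResults = ["a/b", "a/b/c", "none"].map((t) => fromLiteral.test(t));
const stringResults = ["a/b", "a/b/c", "none"].map((t) => fromString.test(t));

console.log(literalResults, stringResults);
```

So a pattern copied from TW into a tool that expects the literal form may need its slashes escaped, even though the unescaped string works inside TW.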

-- Mark



On Thursday, September 5, 2019 at 8:48:08 PM UTC-7, Mohammad wrote:
>
> TT,
>  At https://regex101.com/ the syntax below returns errors; it needs the 
> slash character to be escaped!
>
> Please have a look
>
>
> Best wishes
> Mohammad
>
>
> On Thu, Sep 5, 2019 at 4:05 PM @TiddlyTweeter wrote:
>
>> *Advanced use of the Negated Character Class*
>>
>> *Match titles with defined numbers of "/" slash*
>> *^(([^/]+?)/){1}[^/]+?$*
>>
>> The  difference here is in *{1}*
>>
>> If you change the number then it will change the number of "/" permitted 
>> in the match.
>>
>> You can test it at: 
>> http://tw-regexp.tiddlyspot.com/#RegExp%20Experimentation%20with%20Title 
>>
>> TT
>>


Re: [tw5] Re: Tiddlywiki and regexp examples part ii: Working with fields

2019-09-05 Thread Mohammad Rahmani
One more question:
Does this pattern allow trailing slashes?


Best wishes
Mohammad


On Thu, Sep 5, 2019 at 4:05 PM @TiddlyTweeter 
wrote:

> *Advanced use of the Negated Character Class*
>
> *Match titles with defined numbers of "/" slash*
> *^(([^/]+?)/){1}[^/]+?$*
>
> The  difference here is in *{1}*
>
> If you change the number then it will change the number of "/" permitted
> in the match.
>
> You can test it at:
> http://tw-regexp.tiddlyspot.com/#RegExp%20Experimentation%20with%20Title
>
> TT
>


Re: [tw5] Re: Tiddlywiki and regexp examples part ii: Working with fields

2019-09-05 Thread Mohammad Rahmani
TT,
 At https://regex101.com/ the syntax below returns errors; it needs the slash
character to be escaped!

Please have a look


Best wishes
Mohammad


On Thu, Sep 5, 2019 at 4:05 PM @TiddlyTweeter 
wrote:

> *Advanced use of the Negated Character Class*
>
> *Match titles with defined numbers of "/" slash*
> *^(([^/]+?)/){1}[^/]+?$*
>
> The  difference here is in *{1}*
>
> If you change the number then it will change the number of "/" permitted
> in the match.
>
> You can test it at:
> http://tw-regexp.tiddlyspot.com/#RegExp%20Experimentation%20with%20Title
>
> TT
>


Re: [tw5] Re: Tiddlywiki and regexp examples part ii: Working with fields

2019-09-05 Thread Mohammad Rahmani
TT,
 How can we use this in a real case? Not starting with a $ sign means all
ordinary tiddlers!


Best wishes
Mohammad


On Thu, Sep 5, 2019 at 3:06 PM @TiddlyTweeter 
wrote:

> *Example of using the Negated Character Class*
>
> Very useful regex syntax. Often much more economical than using a positive
> character class.
>
> Match titles NOT starting "$"
>
> *^[^\$]*
>
>
> *"^"* = start of scope, in this case the start of the title field
> *"[^"* = inside a character class, in first position, *"^"* means "match
> the negation" of the following character(s)
> *"\$"* = match the character "$" literally
> *"]" *= close character class
>
> To use this in a filter the regex pattern needs to be put into a variable
> and then invoked. See example here:
> https://groups.google.com/d/msg/tiddlywiki/TOUdt8ZjTa4/5v3wiF6fAQAJ
>
> You can test it at Mohammad's regex documentation site:
> http://tw-regexp.tiddlyspot.com/#RegExp%20Experimentation%20with%20Title
>
> TT
>


Re: [tw5] Re: Tiddlywiki and regexp examples part ii: Working with fields

2019-09-05 Thread Mohammad Rahmani
It is great to be able to experiment at tw-regexp when you give a new
example.
So I highly recommend testing each one there and letting users try it at tw-regexp.

Cheers
Mohammad



On Thu, Sep 5, 2019 at 4:05 PM @TiddlyTweeter 
wrote:

> *Advanced use of the Negated Character Class*
>
> *Match titles with defined numbers of "/" slash*
> *^(([^/]+?)/){1}[^/]+?$*
>
> The  difference here is in *{1}*
>
> If you change the number then it will change the number of "/" permitted
> in the match.
>
> You can test it at:
> http://tw-regexp.tiddlyspot.com/#RegExp%20Experimentation%20with%20Title
>
> TT
>


Re: [tw5] Re: Tiddlywiki and regexp examples part ii: Working with fields

2019-09-05 Thread Mohammad Rahmani
Thanks TT.
I will add these to tw-regexp.

Yes, we did not focus on negated character classes, which can be helpful in
many use cases.


Best wishes
Mohammad


On Thu, Sep 5, 2019 at 4:05 PM @TiddlyTweeter 
wrote:

> *Advanced use of the Negated Character Class*
>
> *Match titles with defined numbers of "/" slash*
> *^(([^/]+?)/){1}[^/]+?$*
>
> The  difference here is in *{1}*
>
> If you change the number then it will change the number of "/" permitted
> in the match.
>
> You can test it at:
> http://tw-regexp.tiddlyspot.com/#RegExp%20Experimentation%20with%20Title
>
> TT
>


Re: [tw5] Re: Tiddlywiki and regexp

2019-08-24 Thread TonyM
Mark,

Thanks for this. I only just got to test it; a test tiddler as follows is 
not working as I expected:
zfdtshwfthf
Content
sfghn
Content2

sfghsfgh
Content3
sxgfhfgsdh

I would have hoped it would return
Content
Content2
Content3


If it was to return only the content between the ` and ` and not 
any other content from the test tiddler I could do this;
\define output()
<$vars realchars="[^\s]+">
<$list 
filter="[{test data}splitregexp[\n]join[ 
]splitregexp[]butfirst[1]splitregexp[]butlast[1]regexpaddprefix[]addsuffix[]]"
>



\end
<$wikify name=result text="<>">
<>

Which would find all list items in test (HTML copied from somewhere) and 
create a new list of only list (li) items in the HTML
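
Outside TW, the goal described here (lift out only the li items, then relist them) can be sketched in plain JavaScript with a capture group. This is not a TW filter and not Mark's method; the HTML below is a hypothetical stand-in for the test tiddler, and all names are made up for the illustration.

```javascript
// Hypothetical HTML with list items mixed in among unwanted text.
const html = [
  "zfdtshwfthf",
  "<ul>",
  "<li>Content</li>",
  "sfghn",
  '<li style="something">Content2</li>',
  "</ul>",
  "sfghsfgh",
  "<li>Content3</li>",
  "sxgfhfgsdh",
].join("\n");

// <li ...> (attributes allowed), then the content captured lazily
// (it may span several lines), then the closing </li>.
const liPattern = new RegExp("<li\\b[^>]*>([\\s\\S]*?)</li>", "g");

// Collect capture group 1 from every match; text between items is dropped.
const items = Array.from(html.matchAll(liPattern), (m) => m[1].trim());

// Re-wrap the captured content as a clean list containing only the items.
const relisted = "<ul>\n" + items.map((i) => "<li>" + i + "</li>").join("\n") + "\n</ul>";

console.log(items);
console.log(relisted);
```

Because the capture group returns the wanted text directly, nothing has to be split away and re-added, which is the step that gets tedious with split-only tools.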

Does that make sense?

Regards
Tony

On Friday, August 23, 2019 at 2:08:20 AM UTC+10, Mark S. wrote:
>
> Re your 2nd question, you can make the filter slightly more robust:
>
> [{test}splitregexp[\n]join[ ]splitregexp[]butfirst[1]splitregexp[ li>]butlast[1]regexp]
>
> Re your 1st question, I don't believe you can do this in a single filter. 
> It will probably take multiple lines if possible at all. Because, there are 
> no core tools
> for grabbing the actual text you want -- only for splitting. People have 
> done a lot with splitting, but it gets tedious.
>
> If you had a regular expression filter that could split and return groups 
> (e.g. #2963) then you could simply search for and lift out the  
> group and the content group in one regular expression.
>
> On Thursday, August 22, 2019 at 7:58:06 AM UTC-7, TonyM wrote:
>>
>> Mark - Wow,
>>
>> I will test it out tomorrow to see how far I can take it. 
>>
>> I hope it works for multi-line tags
>>
>> My interest would be also the option to return
>> line 3
>> line 2
>> line 1
>> or
>> line 3
>> line 2 
>> line 1 
>> Because keeping the valid tags can be made use of as well.
>>
>> And also see how to handle If the list tag had a style eg > style="something"> it would be nice if we could return
>> line 1
>> or
>> line 1
>>
>> If so a lot can be done to extract useful content from html, even if just 
>> to summarise some content.
>>
>> Perhaps further resolution would help like > name=extract>content
>>
>> Or extract list items.
>>
>> Even without using html a tiddlers text field could use html block and 
>> inline elements https://www.w3schools.com/html/html_blocks.asp to 
>> structure the content, and with such a regex macro extract parts of the 
>> tiddler text such as say a prepared extract from the content, or an 
>> excerpt, or a config settings or more.
>>
>> Regards
>> Tony
>>
>>

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tiddlywiki/e7272cb1-b215-4c85-9298-8bcf612f5460%40googlegroups.com.


Re: [tw5] Re: Tiddlywiki and regexp

2019-08-22 Thread Mohammad
Many thanks for the clarification.

I need those explanations when documenting your solution in TW-Scripts.

Cheers
Mohammad

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tiddlywiki/4517ef63-4622-4427-8b68-f29db66ebcef%40googlegroups.com.


Re: [tw5] Re: Tiddlywiki and regexp

2019-08-22 Thread 'Mark S.' via TiddlyWiki
That's a regular expression that says "match anything that is not whitespace". 
It's used to verify that a line is not empty. It has to be defined in a 
variable because it contains square brackets [], which would clash with the 
brackets that delimit filter operands.
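
For anyone who wants to see the pattern on its own, here is a minimal JavaScript sketch (TW's regexp operator is built on JS regular expressions). The helper name dropBlankItems and the sample items are made up for illustration. It also checks that the shorthand \S+ accepts the same strings; whether the shorthand can be written directly in a filter operand is not tested here.

```javascript
// Sketch: "[^\s]+" keeps only items that contain a non-whitespace character.
const realchars = new RegExp("[^\\s]+"); // same pattern as in the $vars widget
const shorthand = /\S+/;                  // equivalent shorthand character class

// Hypothetical helper: drop items that are empty or whitespace only.
function dropBlankItems(items) {
  return items.filter((item) => realchars.test(item));
}

const candidates = ["line 3", " ", "", "line 2", "\t", "line 1"];
console.log(dropBlankItems(candidates)); // [ 'line 3', 'line 2', 'line 1' ]

// Both patterns agree on every candidate.
console.log(candidates.every((s) => realchars.test(s) === shorthand.test(s))); // true
```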

Thanks!

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tiddlywiki/e8f8e523-4578-44fd-9377-72752de33d2d%40googlegroups.com.


Re: [tw5] Re: Tiddlywiki and regexp

2019-08-22 Thread Mohammad
Added to TW-Scripts!

Mark,
What does this part do?

<$vars realchars="[^\s]+">

--Mohammad

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tiddlywiki/9d72d049-e484-409d-a01e-ad30389dbbce%40googlegroups.com.


Re: [tw5] Re: Tiddlywiki and regexp

2019-08-22 Thread 'Mark S.' via TiddlyWiki
Re your 2nd question, you can make the filter slightly more robust:

[{test}splitregexp[\n]join[ ]splitregexp[]butfirst[1]splitregexp[</li>]butlast[1]regexp<realchars>]

Re your 1st question, I don't believe you can do this in a single filter. 
It will probably take multiple lines, if it is possible at all, because there 
are no core tools for grabbing the actual text you want, only for splitting. 
People have done a lot with splitting, but it gets tedious.

If you had a regular expression filter that could split and return groups 
(e.g. #2963) then you could simply search for and lift out the tag 
group and the content group in one regular expression.
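
In plain JavaScript, the "lift out the groups" idea could look roughly like the sketch below. This is not an existing TiddlyWiki operator; the pattern, the function name extractItems and the sample text are all assumptions for illustration. Group 1 is the opening tag (attributes included), group 2 is the content.

```javascript
// Sketch only: one regular expression, two capture groups.
// [\s\S]*? lets the content span several lines without being greedy.
const itemPattern = /(<li\b[^>]*>)([\s\S]*?)<\/li>/g;

// Hypothetical helper: return every <li> item as { tag, content }.
function extractItems(html) {
  return Array.from(html.matchAll(itemPattern), (m) => ({
    tag: m[1],
    content: m[2].trim(),
  }));
}

const htmlSample = [
  "More text here",
  "<li>line 3</li>",
  '<li style="something">line 2</li>',
  "<li>multi",
  "line</li>",
  "More text there",
].join("\n");

console.log(extractItems(htmlSample));
```

Keeping the tag in its own group means the caller can choose to return the content alone or the content wrapped in its original tag. Like any regex approach, it does not cope with nested list items.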

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tiddlywiki/fd507336-3981-4657-9abd-db3d41024a6c%40googlegroups.com.


Re: [tw5] Re: Tiddlywiki and regexp

2019-08-22 Thread @TiddlyTweeter
Mark S.

> All we have is regex.
>
> It would be great to have some other tool for extracting actual DOM-like 
> structures the way you could with TW classic. But we don't have it.
>
> Actually, the tool we have for regexp is also a bit lacking. There's no 
> tool for directly lifting desired target text.

I'd be interested in better documenting the regex operators TW has in the 
context of what JS regex can do. 
I strongly believe it needs referents, i.e. it should be informed by what regex 
"match" AND "replace" do in standard JS.

In raw form (pre-parser intervention) TW can, of course, do anything JS 
can, but that is not at the level where most are working.
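
As a reference point, this is roughly what "match" and "replace" do in standard JS. It is only a sketch; the date pattern, the function names and the sample note are invented for illustration and have nothing to do with any TW operator.

```javascript
// Sketch: the two standard JS regex operations, side by side.
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;

// "match": find the first date and return it with its capture groups.
function firstDate(text) {
  const m = text.match(datePattern);
  return m ? { whole: m[0], year: m[1], month: m[2], day: m[3] } : null;
}

// "replace": rewrite every date, reusing the groups via $1, $2, $3.
function reorderDates(text) {
  const allDates = new RegExp(datePattern.source, "g");
  return text.replace(allDates, "$3/$2/$1");
}

const note = "Posted 2019-08-22, revisited 2019-09-17";
console.log(firstDate(note));
console.log(reorderDates(note)); // Posted 22/08/2019, revisited 17/09/2019
```

The point is that "match" hands back the target text itself, groups included, while "replace" reconstructs the string around it.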

TT







To view this discussion on the web visit 
https://groups.google.com/d/msgid/tiddlywiki/b3e3271e-52a9-489e-8b7a-8e4565bf35b5%40googlegroups.com.


Re: [tw5] Re: Tiddlywiki and regexp

2019-08-22 Thread TonyM
Mark - Wow,

I will test it out tomorrow to see how far I can take it. 

I hope it works for multi-line tags

My interest would be also the option to return
line 3
line 2
line 1
or
line 3
line 2 
line 1 
Because keeping the valid tags can be made use of as well.

And also see how to handle the case where the list tag has a style, e.g. <li style="something">; it would be nice if we could return
line 1
or
line 1

If so a lot can be done to extract useful content from html, even if just 
to summarise some content.

Perhaps further resolution would help, like matching a tag that carries an attribute such as name=extract around the content

Or extract list items.

Even without using html, a tiddler's text field could use html block and 
inline elements https://www.w3schools.com/html/html_blocks.asp to structure 
the content, and with such a regex macro extract parts of the tiddler text, 
such as a prepared extract from the content, an excerpt, config 
settings or more.

Regards
Tony


To view this discussion on the web visit 
https://groups.google.com/d/msgid/tiddlywiki/69ec934d-1330-4961-9758-e2ce91c80e60%40googlegroups.com.


Re: [tw5] Re: Tiddlywiki and regexp

2019-08-22 Thread 'Mark S.' via TiddlyWiki

There's that saying, "When all you have is a hammer, everything starts to 
look like a nail."

All we have is regex. It would be great to have some other tool for 
extracting actual DOM-like structures the way you
could with TW classic. But we don't have it.

Actually, the tool we have for regexp is also a bit lacking. There's no 
tool for directly lifting desired target text. The new splitregexp only 
splits; it doesn't return the text we want to find. Here's my version that 
does most literally what you ask for:

<$vars realchars="[^\s]+">
<$list filter="[{test}splitregexp[\n]join[ ]splitregexp[<li>]butfirst[1]splitregexp[</li>]butlast[1]regexp<realchars>]">

</$list>
</$vars>

Input:

More text here
<li>line 3</li>
<li>line 2</li>
<li>line 1</li>
More text there

Output


line 3 
line 2 
line 1 
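
For readers who think in JS, the same split-only approach can be sketched step by step as below. It assumes the split points are the <li> and </li> tags; the function name extractListItems and the sample text are made up, and the comments map each step to the filter operator it imitates. It is an approximation of the idea, not of how TW evaluates filters internally.

```javascript
// Sketch: imitate the split-based filter with plain array operations.
function extractListItems(text) {
  const nonBlank = /[^\s]+/; // the "realchars" pattern
  return text
    .split(/\n/)                                // splitregexp[\n]
    .join(" ")                                  // join[ ]
    .split(/<li>/)                              // splitregexp[<li>]
    .slice(1)                                   // butfirst[1]: drop text before the first <li>
    .flatMap((piece) => piece.split(/<\/li>/))  // splitregexp[</li>]
    .slice(0, -1)                               // butlast[1]: drop text after the last </li>
    .filter((item) => nonBlank.test(item));     // regexp<realchars>: drop blank leftovers
}

const listSample =
  "More text here\n<li>line 3</li>\n<li>line 2</li>\n<li>line 1</li>\nMore text there";
console.log(extractListItems(listSample)); // [ 'line 3', 'line 2', 'line 1' ]
```

Because it only ever splits, anything sitting between one </li> and the next <li> survives as an extra item unless it is blank, which is why the result is clean only when the items are consecutive.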



Good luck!

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tiddlywiki/11248d8f-46ca-45d6-b3fa-670f8b6c0c80%40googlegroups.com.


Re: [tw5] Re: Tiddlywiki and regexp

2019-08-22 Thread @TiddlyTweeter
Jeremy

I just saw this and thought it very interesting!

Jeremy Ruston wrote
>
> There's an old trope in software that one should never use regexps to 
> parse HTML:
>
> https://blog.codinghorror.com/parsing-html-the-cthulhu-way/
>
> So, while I'd be happy to see general regexp support improved in TW5, I 
> don't think it's appropriate to specifically shape that support for the 
> task of parsing HTML.
>
> Of course, TW5 already includes an HTML parser so perhaps the best 
> approach might be to explore how to make that functionality be more 
> usefully exposed to wikitext.
>

I agree that the base parsers are a good way to come at it. Why? Because 
they work with simple, sound aims. They do not try to regex everything, and 
they sit inside ASTs that give order.

*That said, I think the extent of use of regex under the TW hood is a 
potential revelation to many, and worth understanding more.*

Regarding https://blog.codinghorror.com/parsing-html-the-cthulhu-way/, 
it's exaggerated. IMO, it's a "straw man" argument about regex. It has its 
points, but they are way overstated.

A more measured view is that simple structures can be built quite well 
with regex directly, but it's not a tool for complex situations. *It's merely 
a great tool for textual deconstruction and reconstruction*. At that it's 
excellent.
 
Best wishes
TT

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tiddlywiki/68a12063-b269-4b33-8ea5-d41285170aea%40googlegroups.com.


Re: [tw5] Re: Tiddlywiki and regexp

2019-08-22 Thread TonyM
I will just add I am method agnostic. I just want the ability to, in effect, 
transclude the content between some open and close tags.

Regards
Tony

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tiddlywiki/0a26f856-d914-4973-9c06-adeb1f7599df%40googlegroups.com.


Re: [tw5] Re: Tiddlywiki and regexp

2019-08-22 Thread TonyM
Jeremy,

You are aware I do not want so much to parse it as locate the content between 
matching tags.

Its intention is to access content delimited by html tags inside the text 
content.

Perhaps we could use it to retrieve items between the section div tags or all 
instances of text between the li tags.

Regards
Tony

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tiddlywiki/f0ca4c97-7e59-4b8a-9202-8fd6d18e2239%40googlegroups.com.


Re: [tw5] Re: Tiddlywiki and regexp

2019-08-22 Thread Jeremy Ruston
Hi Tony

There's an old trope in software that one should never use regexps to parse 
HTML:

https://blog.codinghorror.com/parsing-html-the-cthulhu-way/

So, while I'd be happy to see general regexp support improved in TW5, I don't 
think it's appropriate to specifically shape that support for the task of 
parsing HTML.

Of course, TW5 already includes an HTML parser so perhaps the best approach 
might be to explore how to make that functionality be more usefully exposed to 
wikitext.

Best wishes

Jeremy

--
Jeremy Ruston
jer...@jermolene.com
https://jermolene.com

> On 22 Aug 2019, at 01:37, TonyM  wrote:
> 
> 
> Folks,
> 
> I have a great use case for some advanced regex. I would like to provide a 
> macro with a tiddler name (default: current tiddler), a field (default: text) 
> and an html tag, e.g. one of the block or inline elements 
> documented here https://www.w3schools.com/html/html_blocks.asp
> 
> I would like a regex to search the target(s) for the html pairs, e.g. the 
> opening and closing tags around `A List items`, and return the result, 
> optionally with the html tags still present or only the content between them.
> 
> Perhaps later we could enhance this to interrogate ids and other tag info.
> 
> I think this could be implemented in a subfilter, e.g.
> \define html-tags() regex filter
> <$set name=html-tag value="article">
> <$list filter="[[tiddlername]subfilter<html-tags>]">
> 
> </$list>
> </$set>
> 
> See here https://groups.google.com/d/msg/tiddlywiki/s9Y_w85282I/6na45l5KAAAJ
> 
> Regards
> Tony
> 
> 
> 
>> On Friday, August 16, 2019 at 1:56:24 AM UTC+10, Mohammad wrote:
>> As Tiddlywiki supports regexp and this feature is quite powerful, yet many 
>> of us (like me) have little or no information about it, I would like to 
>> introduce this small resource for learning regexp quickly and easily.
>> 
>> https://github.com/ziishaned/learn-regex
>> 
>> 
>> Some time ago Tiddly Twitter had a discussion to give examples of using 
>> regexp in TW, but it seems he is very busy with Polly!
>> 
>> 
>> By the way, if someone can create a wiki on GitHub or tiddlyspot with 
>> practical examples of using regexp, it would be quite useful.
>> 
>> 
>> Cheers
>> Mohammad
> 
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/tiddlywiki/43b26b25-3d08-4c7f-9ef4-ea4daacfd50e%40googlegroups.com.

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tiddlywiki/0C9D2162-7B36-4CB5-B569-550327F50537%40gmail.com.