date:20160913

Thank you for this Timo, and to everyone else who replied.  It did indeed 
address what I wanted to know. -- Darren Duncan


On 2016-09-13 5:15 AM, Timo Paulssen wrote:
> I'll answer based on the data structures MoarVM uses internally:
>
> On 09/13/2016 05:13 AM, Darren Duncan wrote:
> > (Pretend the actual hardware has infinite memory so we wouldn't run
> > out of hardware first.)
> >
> > 1.  Maximum size of a (non-machine) integer?
>
> We're using libtommath, which declares the "used" and "alloc" fields of
> the mp_int as "int", iow a 32bit signed (???) integer.
>
> > 2.  Maximum number of elements in an array?
>
> MVMArray declares the "elems", "start" and "ssize" fields to be
> MVMuint64, so they can become quite a bit bigger than strings.
>
> > 3.  Maximum number of elements in a hash?
>
> The "uthash" library we're using declares the "num_buckets" and
> "num_items" fields as unsigned (so 32bit unsigned integer).
>
> > 4.  Maximum number of bytes/codepoints/etc in a character string?
>
> MVMString declares its "num_graphs" as a MVMuint32, but a graph can be
> multiple codepoints and as such multiple bytes when encoded.
>
> > Following the above, does the Perl 6 language specify any such
> > limits, or does it define the above things to be infinite?
>
> I don't think it does.
>
> Hope that helps!
>   - Timo

Re: grammars and indentation of input

2016-09-13 Thread Theo van den Heuvel

As so often it turned out that the reason my program did not work was 
elsewhere (in the grammar).

My approach worked all along.
It was instructive to look at the examples you guys mentioned. Thanks

Theo

Re: Fwd: Re: grammars and indentation of input

2016-09-13 Thread Bennett Todd

Well put.

The clearest description of Python's approach I've read explained it as a 
lexer that tracked the indentation level and inserted appropriate tokens 
when it changed.

Re: Fwd: Re: grammars and indentation of input

2016-09-13 Thread Aaron Sherman

Oh, a side point: there's some confusion introduced by the lack of a
scanner/lexer in modern all-in-one-parsers.

Python, for example, uses a scanner and so its grammar is nominally not
context sensitive, but its scanner very much is (maintaining a stack of
indentation exactly as OP was asking about). When you do everything in one
place, the result must have the ability to maintain and respond to global
state. There's really no other way around it. A pure BNF cannot parse
Python.
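
To make the scanner idea concrete, here is a minimal sketch in Python of a line-based lexer that keeps a stack of indentation levels and emits extra tokens when the level changes. It is not CPython's tokenizer: the function name `tokenize` and the token kinds `LINE`, `INDENT` and `DEDENT` are chosen for illustration, and it assumes indentation is written with spaces only.

```python
def tokenize(source):
    """Yield (kind, text) pairs, adding INDENT/DEDENT tokens whenever the
    leading whitespace of a line changes (spaces only, for brevity)."""
    stack = [0]                      # indentation levels currently open
    for line in source.splitlines():
        if not line.strip():         # blank lines do not affect nesting
            continue
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:        # deeper than before: open one block
            stack.append(width)
            yield ("INDENT", "")
        while width < stack[-1]:     # shallower: close blocks until we match
            stack.pop()
            yield ("DEDENT", "")
        if width != stack[-1]:       # dedented to a level that was never opened
            raise SyntaxError(f"inconsistent dedent: {line!r}")
        yield ("LINE", line.strip())
    while len(stack) > 1:            # close whatever is still open at end of input
        stack.pop()
        yield ("DEDENT", "")


for kind, text in tokenize("if a:\n    b\n    if c:\n        d\ne"):
    print(kind, text)
```

The stack is the global state: a grammar that consumes these tokens can treat INDENT and DEDENT like ordinary brackets, which is why the grammar itself can stay context free while the scanner cannot.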

Aaron Sherman, M.:
P: 617-440-4332 Google Talk, Email and Google Plus: a...@ajs.com
Toolsmith, developer, gamer and life-long student.

On Tue, Sep 13, 2016 at 12:26 PM, Bennett Todd wrote:

> Hostile or not, thanks for your informative reply.
>

Re: grammars and indentation of input

2016-09-13 Thread Moritz Lenz

Hi,

On 13.09.2016 18:55, Patrick R. Michaud wrote:
> I don't have an example handy, but I can categorically say that
> Perl 6 grammars are designed to support exactly this form of parsing.
> It's almost exactly what I did in "pynie" -- a Python implementation
> on top of Perl 6.  The parsing was done using a Perl 6 grammar.

See https://github.com/arnsholt/snake for a newer implementation that
parses python (or subsets thereof). Its grammar is likely inspired by
pynie, but your chances to get it to run are much better, due to the
more recent development that has gone into it.

Cheers,
Moritz

-- 
Moritz Lenz
https://deploybook.com/ -- https://perlgeek.de/ -- https://perl6.org/

Re: grammars and indentation of input

2016-09-13 Thread Patrick R. Michaud

I don't have an example handy, but I can categorically say that
Perl 6 grammars are designed to support exactly this form of parsing.
It's almost exactly what I did in "pynie" -- a Python implementation
on top of Perl 6.  The parsing was done using a Perl 6 grammar.

If I remember correctly, Pynie had , , and 
grammar rules.  The grammar kept a stack of known indentation levels.
The  rule was a zero-width match that would succeed when it
found leading whitespace greater than the current indentation level
(and push the new level onto the stack).  The  rule
was a zero-width match that succeeded when the leading whitespace
exactly matched the current indentation level.  And the 
rule would be called when  and  no longer 
matched, popping the top level off the stack.

So the grammar rule to match an indented block ended up looking
something like (I've shortened the example here):

token suite {

[   ]*
[  |  ]
}

A python "if statement" then looked like:

rule if_stmt {
    'if'  ':' <suite>
    [ 'elif'  ':' <suite> ]*
    [ 'else' ':' <suite> ]?
}

where the <suite> subrules would match the statements or block
of statements indented within the "if" statement.

However, all of , , and  were written using
"normal" (non-regular expression) code.  Perl 6 makes this easy; since 
grammar rules are just methods in a class (that have a different code
syntax), you can create your own methods to emulate a grammar rule.  
The methods simply need to follow the Cursor protocol; that is, 
return Match objects indicating success/failure/length of whatever has 
been parsed at that point.
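
As a rough illustration of the technique described above (zero-width checks written as ordinary methods that consult a stack of indentation levels), here is a sketch in Python rather than as a Perl 6 grammar. It is not the Pynie code: the class `IndentParser` and the method names `indent`, `same_indent`, `dedent`, `suite` and `parse` are hypothetical, and it assumes indentation uses spaces only.

```python
class IndentParser:
    """Parses lines into nested lists; each deeper-indented run of lines
    becomes a sub-list of the block that contains it."""

    def __init__(self, text):
        self.lines = [l for l in text.splitlines() if l.strip()]
        self.pos = 0                 # index of the next unread line
        self.levels = [0]            # stack of known indentation levels

    def _width(self):
        # Leading spaces of the next line, or -1 at end of input.
        if self.pos >= len(self.lines):
            return -1
        line = self.lines[self.pos]
        return len(line) - len(line.lstrip(" "))

    # The three checks below consume no input (they are zero-width).
    def indent(self):
        if self._width() > self.levels[-1]:
            self.levels.append(self._width())   # remember the new level
            return True
        return False

    def same_indent(self):
        return self._width() == self.levels[-1]

    def dedent(self):
        if len(self.levels) > 1 and self._width() < self.levels[-1]:
            self.levels.pop()                   # leave the current level
            return True
        return False

    def suite(self):
        """One block: statements at the current level plus nested blocks."""
        block = []
        while self.same_indent():
            block.append(self.lines[self.pos].strip())
            self.pos += 1
            if self.indent():
                block.append(self.suite())
        self.dedent()   # pop this block's level if the next line is shallower
        return block

    def parse(self):
        tree = self.suite()
        if self.pos != len(self.lines):
            raise SyntaxError(f"unexpected indentation: {self.lines[self.pos]!r}")
        return tree


print(IndentParser("if x:\n    a\n    b\nelse:\n    c").parse())
```

The point of the sketch is only that the checks read parser state (the `levels` stack) and report success or failure without consuming text, which is the role the hand-written methods play in the grammar described above.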

I hope this is a little useful.  If I can dig up or recreate a more 
complete Python implementation example sometime, I'll post it.

Pm

On Tue, Sep 13, 2016 at 01:13:45PM +0200, Theo van den Heuvel wrote:
> Hi all,
> 
> I am beginning to appreciate the power of grammars and the Match class. This
> is truly a major asset within Perl6.
> 
> I have a question on an edge case. I was hoping to use a grammar for an
> input that has meaningful indented blocks.
> I was trying something like this:
> 
>   token element { <.lm> [  | $=[ ' '+ ] )> ] }
>   token lm { ^^ ' '**{$cur-indent} } # skip up to current indent level
> 
> My grammar has a method called within the level rule that maintains a stack
> of indentations and sets a $cur-indent.
> I can imagine that the inner workings of the parser (i.e. optimization)
> frustrate this approach.
> Is there a way to make something like this work?
> 
> Thanks,
> Theo
> 
> -- 
> Theo van den Heuvel
> Van den Heuvel HLT Consultancy

Re: Fwd: Re: grammars and indentation of input

2016-09-13 Thread Bennett Todd

Hostile or not, thanks for your informative reply.

Re: Fwd: Re: grammars and indentation of input

2016-09-13 Thread Bennett Todd

Thank you, very much. Yes, I'm disappointed, but I'd rather know.

Re: Fwd: Re: grammars and indentation of input

2016-09-13 Thread Aaron Sherman

>
> Having the minutia of the programmatic run-time state of the parse then
> influence the parse itself, is at the heart of the perl5 phenomenon "only
> Perl can parse perl"

I don't mean to be hostile, but you're demonstrably wrong, here. (also it's
"only perl can parse Perl" as in, only the "perl" implementation can parse
the Perl language).

There are currently about 2.5 implementations of Perl 6, and while you
could backhandedly claim that only Perl 6 can parse Perl 6 (because it's
specced as a self-hosting language whose spec is actually written in Perl
6), the reality is that the parts that aren't written in Perl 6 can be
written in just about anything (with C/MoarVM and JVM implementations
working just fine).

It's not a context sensitive grammar that was the issue with Perl 5, it was
the lack of a specification outside of the primary implementation.

Aaron Sherman, M.:
P: 617-440-4332 Google Talk, Email and Google Plus: a...@ajs.com
Toolsmith, developer, gamer and life-long student.

On Tue, Sep 13, 2016 at 10:35 AM, Bennett Todd wrote:

> Having the minutia of the programmatic run-time state of the parse then
> influence the parse itself, is at the heart of the perl5 phenomenon "only
> Perl can parse perl", which I rather hope isn't going to be preserved in
> perl6.
>

Re: Fwd: Re: grammars and indentation of input

2016-09-13 Thread Patrick R. Michaud

On Tue, Sep 13, 2016 at 10:35:01AM -0400, Bennett Todd wrote:
> Having the minutia of the programmatic run-time state of the parse then 
> influence the parse itself, is at the heart of the perl5 phenomenon "only 
> Perl can parse perl", which I rather hope isn't going to be preserved in 
> perl6.

You may be disappointed to read this:  Not only is this feature preserved in 
Perl 6... it's something of a prerequisite.  It is what is required for a truly 
dynamic language that has many things happening in BEGIN blocks (i.e., things 
get executed even before you finish compiling the thing you're working on) and 
that allows dynamically adding new statement types and language features to the 
grammar.

When implementing Perl 6, I think many of us aimed to minimize the amount of 
"runtime" things that happened during the parse... only to discover that we 
actually had to embrace and formalize it instead.

Pm

Re: Fwd: Re: grammars and indentation of input

2016-09-13 Thread Theo van den Heuvel


Hi Bennett,

There are many situations that require non-context-free languages. Even 
though many of these could be solved in the AST-building step (called 
'transduction' in my days) instead of the parsing step, that does not 
solve all cases. I am just wondering if and to what extent we can parse 
non-CF languages with perl6. The reference to Perl5 is not appropriate, 
because a programming language is supposed to be designed to be 
relatively easy to parse. The language designer will have the need for 
an interpreter for the language high on his/her list of priorities.


It all boils down to the question: does a Grammar allow a rule that 
depends on a function instead of being constant?
I am not shocked if it is impossible, I just want to know how far perl6 
takes me.


Theo

Bennett Todd wrote on 2016-09-13 16:35:

> Having the minutia of the programmatic run-time state of the parse
> then influence the parse itself, is at the heart of the perl5
> phenomenon "only Perl can parse perl", which I rather hope isn't going
> to be preserved in perl6.

Re: Fwd: Re: grammars and indentation of input

2016-09-13 Thread Bennett Todd

Having the minutia of the programmatic run-time state of the parse then 
influence the parse itself, is at the heart of the perl5 phenomenon "only Perl 
can parse perl", which I rather hope isn't going to be preserved in perl6.

Fwd: Re: grammars and indentation of input

2016-09-13 Thread Theo van den Heuvel




Thanks Timo and Brian,

both examples are educational. However, they have a common limitation in 
that they both perform their magic after a Match object has been 
created. I was trying to influence the parsing step itself.
I am experimenting to find if I can influence the parsing process 
programmatically. Indentation is just an example here.
The stack of indentation levels is maintained fine, but I cannot seem to 
use the knowledge of the current indentation to affect the rule for the 
left margin.

Theo

Timo Paulssen wrote on 2016-09-13 13:56:
> I haven't read your code, but your question immediately made me think of
> this module:
>
> https://github.com/masak/text-indented
>
> Would be interested to hear if this helps you!
>   - Timo

Re: grammars and indentation of input

2016-09-13 Thread Aaron Sherman

I don't see why optimization would frustrate this approach. You are doing
the correct thing as far as I can tell, but with one exception. The current
implementation (last I checked) was sometimes slow in binding values. You
might need to force it between an assignment and passing a bound match as a
parameter by inserting an empty block. You can see this documented and used
here:

http://examples.perl6.org/categories/parsers/SimpleStrings.html

Aaron Sherman, M.:
P: 617-440-4332 Google Talk, Email and Google Plus: a...@ajs.com
Toolsmith, developer, gamer and life-long student.

On Tue, Sep 13, 2016 at 7:13 AM, Theo van den Heuvel wrote:

> Hi all,
>
> I am beginning to appreciate the power of grammars and the Match class.
> This is truly a major asset within Perl6.
>
> I have a question on an edge case. I was hoping to use a grammar for an
> input that has meaningful indented blocks.
> I was trying something like this:
>
>   token element { <.lm> [  | $=[ ' '+ ] )> ] }
>   token lm { ^^ ' '**{$cur-indent} } # skip up to current indent level
>
> My grammar has a method called within the level rule that maintains a
> stack of indentations and sets a $cur-indent.
> I can imagine that the inner workings of the parser (i.e. optimization)
> frustrate this approach.
> Is there a way to make something like this work?
>
> Thanks,
> Theo
>
> --
> Theo van den Heuvel
> Van den Heuvel HLT Consultancy
>

Re: coded size limits on Perl data types?

2016-09-13 Thread Timo Paulssen

On 09/13/2016 03:12 PM, Timo Paulssen wrote:
> If one big integer is allowed to be 14 gigabytes big (if we use the
> default of 28 bits per "mp digit"; it's also possible to use 31 or
> 60.), we can still safely say "limited only by memory" for now.

Something else that makes this pretty difficult is that our big ints
currently act immutable, so if we ++ a number all the way up to 14
gigabytes, we'll keep around a whole bunch of older big ints until the
GC thinks it has to kick in ...

If you have an exabyte of RAM or a significant amount of swap space, you
ought to be able to work with this, though.

Be prepared for The Slow, though :)

Re: coded size limits on Perl data types?

2016-09-13 Thread Aaron Sherman

It's also the case that there's no language dependency on any given
implementation of larger ints. We could move to a format that, though
slower, would support truly arbitrary-sized integers in the future, should
a real need arise.

But I can assure you that such a need isn't likely. Working with very large
numbers doesn't really work the same as working with small numbers. When
you want to understand a number like the power-tower of 3s that is 3^3^3
high (a mammoth number, but only the first step in the 64-iteration process
of producing Graham's number), representing it as a region of memory isn't
really interesting. What you want to know about it are things that there
are other ways to calculate, such as its modulus by certain primes, its log
in various bases, etc.
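
A small Python sketch of that point, using a much smaller number than the one described above: a tower of only four 3s. Even that is far too large to store, yet its residue modulo a prime and its approximate length are cheap to compute. The prime 1_000_000_007 is an arbitrary choice for illustration.

```python
import math

exponent = 3 ** 3 ** 3      # ** is right-associative: 3 ** 27 = 7625597484987
prime = 1_000_000_007       # arbitrary prime, chosen only for illustration

# 3 ** exponent has trillions of digits, so we never build it.
# Three-argument pow() does modular exponentiation by repeated squaring.
residue = pow(3, exponent, prime)

# The size of the number follows from logarithms alone.
approx_digits = exponent * math.log10(3)

print(f"3^3^3^3 mod {prime} = {residue}")
print(f"about {approx_digits:.3e} decimal digits")
```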

Aaron Sherman, M.:
P: 617-440-4332 Google Talk, Email and Google Plus: a...@ajs.com
Toolsmith, developer, gamer and life-long student.

On Tue, Sep 13, 2016 at 9:12 AM, Timo Paulssen  wrote:

> On 09/13/2016 02:26 PM, Elizabeth Mattijsen wrote:
> >> On 13 Sep 2016, at 14:15, Timo Paulssen  wrote:
> >>
> >> I'll answer based on the data structures MoarVM uses internally:
> >>
> >> On 09/13/2016 05:13 AM, Darren Duncan wrote:
> >>
> >>> (Pretend the actual hardware has infinite memory so we wouldn't run
> >>> out of hardware first.)
> >>>
> >>> 1.  Maximum size of a (non-machine) integer?
> >>
> >> We're using libtommath, which declares the "used" and "alloc" fields of
> >> the mp_int as "int", iow a 32bit signed (???) integer.
> > eh, but Int is supposed to be a bigint, only limited by memory, no?
> >
> >
> > Liz
>
> If one big integer is allowed to be 14 gigabytes big (if we use the
> default of 28 bits per "mp digit"; it's also possible to use 31 or 60.),
> we can still safely say "limited only by memory" for now.
>

Re: coded size limits on Perl data types?

2016-09-13 Thread Timo Paulssen

On 09/13/2016 02:26 PM, Elizabeth Mattijsen wrote:
>> On 13 Sep 2016, at 14:15, Timo Paulssen  wrote:
>>
>> I'll answer based on the data structures MoarVM uses internally:
>>
>> On 09/13/2016 05:13 AM, Darren Duncan wrote:
>>
>>> (Pretend the actual hardware has infinite memory so we wouldn't run
>>> out of hardware first.)
>>>
>>> 1.  Maximum size of a (non-machine) integer?
>>
>> We're using libtommath, which declares the "used" and "alloc" fields of the 
>> mp_int as "int", iow a 32bit signed (???) integer.
> eh, but Int is supposed to be a bigint, only limited by memory, no?
>
>
> Liz

If one big integer is allowed to be 14 gigabytes big (if we use the
default of 28 bits per "mp digit"; it's also possible to use 31 or 60.),
we can still safely say "limited only by memory" for now.
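
For reference, the arithmetic behind a figure of this size can be sketched in Python. This counts only the bits of numeric value (28 per digit, as stated above), not the memory actually allocated per digit. It works out both readings of the digit count: a signed 32-bit count, and a full 32-bit count, which seems to be where the 14 gigabyte figure comes from. The helper name `max_value_bytes` is made up for this sketch.

```python
DIGIT_BITS = 28     # default bits of value per "mp digit", as stated above


def max_value_bytes(max_digits, digit_bits=DIGIT_BITS):
    """Bytes of numeric value representable with max_digits digits."""
    return max_digits * digit_bits // 8


signed_limit = max_value_bytes(2**31 - 1)   # digit count held in a signed 32-bit int
unsigned_limit = max_value_bytes(2**32)     # if all 32 bits could count digits

print(f"signed count:   {signed_limit / 2**30:.1f} GiB")
print(f"unsigned count: {unsigned_limit / 2**30:.1f} GiB")
```

Either way the bound is in the gigabytes, so "limited only by memory" remains a fair description on typical machines.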

Re: grammars and indentation of input

2016-09-13 Thread Brian Duggan

I've also recently been experimenting with parsing an
indent-based language -- specifically, a small subset
of Slim () -- I push to a stack
when I see a tag, and pop based on the depth of the
indentation.

Here's a working example:

  https://git.io/vig93

Brian

Re: coded size limits on Perl data types?

2016-09-13 Thread Elizabeth Mattijsen

> On 13 Sep 2016, at 14:15, Timo Paulssen  wrote:
> 
> I'll answer based on the data structures MoarVM uses internally:
> 
> On 09/13/2016 05:13 AM, Darren Duncan wrote:
> 
> > (Pretend the actual hardware has infinite memory so we wouldn't run
> > out of hardware first.)
> > 
> > 1.  Maximum size of a (non-machine) integer?
> 
> 
> We're using libtommath, which declares the "used" and "alloc" fields of the 
> mp_int as "int", iow a 32bit signed (???) integer.

eh, but Int is supposed to be a bigint, only limited by memory, no?


Liz

Re: coded size limits on Perl data types?

2016-09-13 Thread Timo Paulssen

I'll answer based on the data structures MoarVM uses internally:

On 09/13/2016 05:13 AM, Darren Duncan wrote:
> (Pretend the actual hardware has infinite memory so we wouldn't run
> out of hardware first.)
>
> 1.  Maximum size of a (non-machine) integer?

We're using libtommath, which declares the "used" and "alloc" fields of
the mp_int as "int", iow a 32bit signed (???) integer.

> 2.  Maximum number of elements in an array?

MVMArray declares the "elems", "start" and "ssize" fields to be
MVMuint64, so they can become quite a bit bigger than strings.

> 3.  Maximum number of elements in a hash?

The "uthash" library we're using declares the "num_buckets" and
"num_items" fields as unsigned (so 32bit unsigned integer).

> 4.  Maximum number of bytes/codepoints/etc in a character string?

MVMString declares its "num_graphs" as a MVMuint32, but a graph can be
multiple codepoints and as such multiple bytes when encoded.

> Following the above, does the Perl 6 language specify any such
> limits, or does it define the above things to be infinite?

I don't think it does.

Hope that helps!
  - Timo
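
To put numbers on those field widths, here is a short Python sketch. The widths are taken from the message above and have not been re-checked against the MoarVM or uthash sources; the dictionary keys are just labels for this example.

```python
# Field widths in bits, as described in the message above.
FIELD_BITS = {
    "array elems (MVMuint64)": 64,
    "hash num_items (unsigned)": 32,
    "string num_graphs (MVMuint32)": 32,
}


def max_unsigned(bits):
    """Largest value an unsigned integer of the given width can hold."""
    return 2**bits - 1


limits = {name: max_unsigned(bits) for name, bits in FIELD_BITS.items()}
for name, limit in limits.items():
    print(f"{name:32} {limit:,}")
```

These are upper bounds on the counters only; the amount of addressable memory would normally be reached first.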

Re: grammars and indentation of input

2016-09-13 Thread Timo Paulssen

I haven't read your code, but your question immediately made me think of
this module:

https://github.com/masak/text-indented

Would be interested to hear if this helps you!
  - Timo

grammars and indentation of input

2016-09-13 Thread Theo van den Heuvel


Hi all,

I am beginning to appreciate the power of grammars and the Match class. 
This is truly a major asset within Perl6.


I have a question on an edge case. I was hoping to use a grammar for an 
input that has meaningful indented blocks.

I was trying something like this:

  token element { <.lm> [  | $=[ ' '+ ] )> ] 
}

  token lm { ^^ ' '**{$cur-indent} } # skip up to current indent level

My grammar has a method called within the level rule that maintains a 
stack of indentations and sets a $cur-indent.
I can imagine that the inner workings of the parser (i.e. optimization) 
frustrate this approach.

Is there a way to make something like this work?

Thanks,
Theo

--
Theo van den Heuvel
Van den Heuvel HLT Consultancy
