Re: Linking completely statically

2020-08-10 Thread Tyson Whitehead
On Mon, 10 Aug 2020 at 03:12, Bardur Arantsson  wrote:

> Once you have that bit set up, you can copy *most* of the libraries
> .so's you're using to that folder on the install target and they'll be
> loaded from there. (Copying libc.so is ill advised, but most of the
> others will work just fine. I believe ld-linux-... is also never ever
> loaded from there since that *is* the dynamic loader.)


Not 100% certain, but I think the issue with libc may be that it is
somewhat tied to ld-linux.

You can actually copy everything, including ld-linux, though. You then
invoke the program via ld-linux (it's actually an executable -- see the
ld.so man page)

./ld-linux-x86_64.so.2 --library-path <dir> <binary> <args>
  ...

If you wrap your binaries with a shell script it is actually pretty clean
(make sure to exec so you don't wind up with the shell sitting between
parent and child, which can cause some subtle issues)

The only real breakage I've seen from this is when you try to run on too
old a kernel for your libc, or the remote system has an unusual nsswitch
plugin you didn't provide.
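As an illustration of the exec point (my addition, not from the original mail): the same wrapper can be sketched in Haskell, where executeFile is the moral equivalent of exec, so no extra process sits between parent and child. The paths ./ld-linux-x86_64.so.2, ./libs, and ./myprog.real are made-up bundle-relative names.

```haskell
import System.Environment (getArgs)
import System.Posix.Process (executeFile)

main :: IO ()
main = do
  args <- getArgs
  -- Replace the current process with the bundled dynamic loader, which
  -- loads the real binary against the bundled libraries.  False means
  -- "do not search PATH" -- the loader path is taken literally.
  executeFile "./ld-linux-x86_64.so.2" False
              (["--library-path", "./libs", "./myprog.real"] ++ args)
              Nothing
```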
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users


Re: [Haskell-cafe] Decorating exceptions with backtrace information

2020-05-14 Thread Tyson Whitehead
On Tue, 12 May 2020 at 17:30, Henning Thielemann <
lemm...@henning-thielemann.de> wrote:

> From your list of examples I deduce that the proposal is about programming
> errors. But we have HasCallStack for that one. How does the proposal
> improve or alter the HasCallStack solution? And how does it relate to the
> IO exception system with hierarchical exceptions and SomeException and so
> on?
>

As a parallel item, maybe it would be good if incomplete patterns could
have a HasCallStack constraint, so the current "Non-exhaustive patterns in
function foo" message could be extended with a helpful backtrace? If a
programmer doesn't want this, they could use complete matches.
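For illustration (my addition): this is roughly what the suggestion would automate. Today you get a backtrace only if you write the fall-through case by hand under a HasCallStack constraint; headOf is a hypothetical name.

```haskell
import GHC.Stack (HasCallStack)

-- A hand-written analogue of what an incomplete pattern compiles to:
-- with HasCallStack in scope, the error call reports a backtrace
-- instead of just "Non-exhaustive patterns in function headOf".
headOf :: HasCallStack => [a] -> a
headOf (x:_) = x
headOf []    = error "headOf: empty list"

main :: IO ()
main = print (headOf "abc")
```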


Re: uninstalling libraries

2017-11-13 Thread Tyson Whitehead
I used to fight with a combination of system and cabal installed packages
under debian testing.

Then, about a year ago, the frustration became too much and I switched to
using nix, intero in emacs, and stack with the nix resolver, all installed
with nix on my debian machine.  Couldn't be happier.

If I want a new dependency, I just add it to my cabal file, rerun
cabal2nix, restart intero and it all just works. Talk about a vastly
improved experience over what I used to have.

As a major bonus I now also run the exact same setup on our HPC cluster,
which runs centos 6, and I can switch between them seamlessly.

Cheers -Tyson

PS: The only gotcha I ran into so far is that stack/intero installs a
ghc-paths package outside of nix on first startup that can cause conflicts
with some packages. It turns out you can just manually remove it to
resolve this, so no big deal.
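For context (my addition), the cabal2nix step in that workflow generates a Nix expression from the project's .cabal file; a hypothetical output, with package name and dependencies invented for illustration, looks something like:

```nix
# Hypothetical result of running: cabal2nix . > default.nix
{ mkDerivation, base, containers, stdenv }:
mkDerivation {
  pname = "myproject";
  version = "0.1.0.0";
  src = ./.;
  libraryHaskellDepends = [ base containers ];
  license = stdenv.lib.licenses.bsd3;
}
```

Adding a dependency to the .cabal file and rerunning cabal2nix regenerates this file, which is why the workflow described above "just works".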

On Sun, Nov 12, 2017, 23:52 Brandon Allbery,  wrote:

> cabal and stack, and in the case of stack, cabal new-build, and possibly
> cabal sandboxes, you probably shouldn't be trying to uninstall.
>
> And yes, the data lines are telling ghc what to compile / link with, not
> files but command line inclusions. And this will be especially messy on OS
> X because of the need to group .dylibs together to avoid making the link
> commands section too large with multiple RPATH entries. (Which will also
> complicate uninstallation there.)
>
> On Sun, Nov 12, 2017 at 11:45 PM, Evan Laforge  wrote:
>
>> On Sun, Nov 12, 2017 at 8:14 PM, Brandon Allbery 
>> wrote:
>> > This is something of a nasty problem, considering that storing uninstall
>> > information separately is not particularly robust. Perhaps ghc-pkg
>> should,
>> > if it doesn't already, support extension fields that e.g. cabal can use
>> to
>> > store uninstall information. (But even that potentially has problems,
>> given
>> > that people are known to copy package registration information between
>> > package databases. If there is uninstall information in there, what
>> happens
>> > if someone uninstalls via one or the other copy?)
>>
>> Aren't packages only allowed to install a restricted set of things
>> into a restricted set of places?  We have the code (.hi, .a, .so,
>> etc., in import-dirs / library-dirs), possibly library-specific data
>> (I assume that's data-dir), and haddock (haddock-html and
>> haddock-interfaces, awkwardly in separate places).
>>
>> One problem is that I don't fully understand what those fields mean,
>> is there documentation somewhere?  And then the fact that these are
>> all plural so presumably you could have a lot of them, what is that
>> for?
>>
>> I'm guessing library-dirs means something like "put this on your -L
>> line" so it's clearly a mistake to interpret that as "here's where I
>> put the library", and you'll have things like /usr/local/lib if you
>> need to link external libraries.
>>
>> Is there any more complicated install plan than put *.a, *.so, *.hi in
>> $root/lib/$ghc/$package-$id, put haddock in
>> $root/share/doc/$ghc/$package, put ad-hoc junk in
>> $root/share/$ghc/$package?  I assume there must be, but who's doing
>> that and why?  If it's OS packages, then they have their own uninstall
>> mechanisms, presumably not ghc's problem.
>>
>
>
>
> --
> brandon s allbery kf8nh   sine nomine
> associates
> allber...@gmail.com
> ballb...@sinenomine.net
> unix, openafs, kerberos, infrastructure, xmonad
> http://sinenomine.net
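For reference (my addition), the fields under discussion are part of ghc-pkg's InstalledPackageInfo records; a registration entry looks roughly like the fragment below, with all paths invented for illustration. Note the plural fields: library-dirs can list both the package's own directory and external locations such as /usr/local/lib, which is exactly why it can't be read as "here's where I put the library".

```
name:               example
version:            1.0
import-dirs:        /home/u/.cabal/lib/x86_64-linux-ghc-8.2.1/example-1.0
library-dirs:       /home/u/.cabal/lib/x86_64-linux-ghc-8.2.1/example-1.0
                    /usr/local/lib
hs-libraries:       HSexample-1.0
data-dir:           /home/u/.cabal/share/x86_64-linux-ghc-8.2.1/example-1.0
haddock-html:       /home/u/.cabal/share/doc/example-1.0/html
haddock-interfaces: /home/u/.cabal/share/doc/example-1.0/html/example.haddock
```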


Re: suboptimal ghc code generation in IO vs equivalent pure code case

2016-05-14 Thread Tyson Whitehead

On 14/05/16 02:31 PM, Harendra Kumar wrote:

The difference seems to be entirely due to memory pressure. At list size 1000 
both pure version and IO version perform equally. But as the size of the list 
increases the pure version scales linearly while the IO version degrades 
exponentially. Here are the execution times per list element in ns as the list 
size increases:

Size of list   Pure   IO
1000           8.7    8.3
10000          8.7    18
100000         8.8    63
1000000        9.3    786

This seems to be due to increased GC activity in the IO case. The GC stats for 
list size 1 million are:

IO case:   %GC time  66.1%  (61.1% elapsed)
Pure case:   %GC time   2.6%  (3.3% elapsed)

Not sure if there is a way to write this code in IO monad which can reduce this 
overhead.


Something to be aware of is that GHC currently can't pass multiple return 
values in registers (that may not be a 100% accurate statement, but it is a 
reasonable high-level summary; see the ticket for details)

https://ghc.haskell.org/trac/ghc/ticket/2289

This can bite you with the IO monad, as having to pass around the world 
state token turns single return values into multiple return values (i.e., the 
new state token plus the returned value).

I haven't actually dug into your code to see if this is part of the problem, 
but figured I would mention it.
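As an illustration of the shape of the issue (my example, not Harendra's code): the IO version of a fold semantically returns one value, but operationally each call also hands back the new world token, i.e. multiple return values.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Pure loop: each recursive call returns a single Int.
sumPure :: [Int] -> Int
sumPure = go 0
  where go !acc []     = acc
        go !acc (x:xs) = go (acc + x) xs

-- The same loop in IO: conceptually each call returns
-- (# new State# RealWorld, Int #), the situation ticket #2289 is about.
sumIO :: [Int] -> IO Int
sumIO = go 0
  where go !acc []     = return acc
        go !acc (x:xs) = go (acc + x) xs

main :: IO ()
main = do
  print (sumPure [1 .. 1000])
  print =<< sumIO [1 .. 1000]
```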

Cheers!  -Tyson


Re: Class instance specificity order (was Re: Fundeps and type equality)

2013-01-11 Thread Tyson Whitehead
On January 11, 2013 13:55:58 Simon Peyton-Jones wrote:
> |  The -XOverlappingInstances flag instructs GHC to allow more than one
> |  instance to match, provided there is a most specific one. For example,
> |  the constraint C Int [Int] matches instances (A), (C) and (D), but the
> |  last is more specific, and hence is chosen. If there is no most-specific
> |  match, the program is rejected."
> |  
> |  What it doesn't say is how "most-specific match" is computed.
> 
> An instance declaration (A) is "more specific" than another (B) iff the
> head of (A) is a substitution instance of (B). For example
> 
> (A)   instance context1 => C [Maybe a]
> (B)   instance context2 => C [b]
> 
> Here (A) is more specific than (B) because we can get (A) from (B) by
> substituting b:=Maybe a.
> 
> Does that make it clear?  If so I will add it to the manual.

Thanks Simon.  That is exactly what I was looking for.  It is clear now.

> |  not cases are so clear.  For example, which of
> |  
> |  instance context1 => C Int b
> |  instance context2 => C a   [[b]]
> |  
> |  does C Int [[Int]] match best against?
> 
> Neither instance is more specific than the other, so C Int [[Int]] is
> rejected.

This was what I was wondering.

> |  If there isn't currently a good
> |  way to resolve this, I would like to suggest the type-shape measure I
> |  proposed in that paper I wrote up awhile back could be used.
> 
> I think the current design is reasonable, and I don't want to complicate it
> unless there is a very compelling reason to do so.

Applying the type-shape measure to resolve this boils the problem down to what 
unification requires

C Int b   --(b = [[Int]])-->   C Int [[Int]]
C a [[b]]   --(a = Int, b = Int)-->   C Int [[Int]]

The assignments required for the first (b = [] ([] Int)) involve three terms.  
The assignments required for the second (a = Int and b = Int) involve two 
terms.  This means the second wins out as being the simplest.

It's easy to compute, and I believe it completely subsumes GHC's current 
definition of more specific.  The fact that we can get (A) from (B) by doing 
substitutions on (B) implies that something that unifies against both (A) and 
(B) will involve more terms in the substitutions done when unifying against (B).

The complexity of the paper was in the derivation of the measure.  The 
implication of the results is really really simple.  Just count terms.
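Spelled out as code (a minimal sketch with a made-up class C): under GHC's substitution-instance rule neither instance below is more specific than the other, so C Int [[Int]] is rejected, whereas the term-counting measure proposed above would pick the second.

```haskell
{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}

class C a b where
  witness :: a -> b -> String

instance C Int b where
  witness _ _ = "C Int b"

instance C a [[b]] where
  witness _ _ = "C a [[b]]"

-- Unambiguous uses compile fine: only C Int b matches here.
ok :: String
ok = witness (1 :: Int) True

-- But `witness (1 :: Int) [[2 :: Int]]` matches both instances, and
-- since neither is a substitution instance of the other, GHC rejects
-- it even with OverlappingInstances enabled.

main :: IO ()
main = putStrLn ok
```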

Cheers!  -Tyson



RE: Class instance specificity order (was Re: Fundeps and type equality)

2013-01-10 Thread Tyson Whitehead
On Thu, 2013-01-10 at 22:17 +, Simon Peyton-Jones wrote:
> Is 
> http://www.haskell.org/ghc/docs/latest/html/users_guide/type-class-extensions.html#instance-overlap
>  
> insufficiently clear?  If so, let's clarify it.

Thanks for getting back to me Simon.  The document says

"For example, consider these declarations:

instance context1 => C Int a where ... -- (A)
instance context2 => C a   Bool  where ... -- (B)
instance context3 => C Int [a]   where ... -- (C)
instance context4 => C Int [Int] where ... -- (D)

...

The -XOverlappingInstances flag instructs GHC to allow more than one
instance to match, provided there is a most specific one. For example,
the constraint C Int [Int] matches instances (A), (C) and (D), but the
last is more specific, and hence is chosen. If there is no most-specific
match, the program is rejected."

What it doesn't say is how "most-specific match" is computed.  While it
is pretty clear in the given example (one matches exactly), not all
cases are so clear.  For example, which of

instance context1 => C Int b
instance context2 => C a   [[b]]

does C Int [[Int]] match best against?  If there isn't currently a good
way to resolve this, I would like to suggest the type-shape measure I
proposed in that paper I wrote up awhile back could be used.

Thanks!  -Tyson





Class instance specificity order (was Re: Fundeps and type equality)

2013-01-10 Thread Tyson Whitehead
On January 10, 2013 13:56:02 Richard Eisenberg wrote:
> Class instances that overlap are chosen among by order of specificity;

Sorry to jump in the middle here, but this caught my attention as this sort of 
specificity determination is exactly what I had in mind when I was working on 
my "The shape of things: a reason for type preference" paper.

I would love to know what approach GHC is currently taking to determine this.  
The document gives the example of matching C Int [Int] against

C Int a
C a Bool
C Int [a]
C Int [Int]

This is easy though.  Obviously the last match is best because it is exact.  
What if we have something like C Int [[Int]] against

C Int b
C a [[b]]

though.  How is the best match determined in this case?

Thanks!  -Tyson




Re: Dynamic libraries by default and GHC 7.8

2012-11-28 Thread Tyson Whitehead
On November 27, 2012 22:02:33 Jens Petersen wrote:
> > GHC HEAD now has support for using dynamic libraries by default (and in
> > particular, using dynamic libraries and the system linker in GHCi) for a
> > number of platforms.
> 
> I am very happy to hear this news.
> 
> I have long been a quiet proponent for defaulting ghc and Cabal to
> shared libraries and dynamic linking for Linux.
> I think Fedora was probably the first Linux distro to enable ghc
> shared libs and use dynamic linking for executables in our packages.
> Having ghci and cabal linking benefiting from dynamic linking seems a
> positive step forward indeed IMHO.
> I hope we will see ghc shared libs support spreading also to other
> Linux archs (particularly ARM).

Hi Jens,

I was curious how Fedora has managed to go dynamic given the various issues 
that the Debian team has brought up.

Is the situation on Fedora somehow different?  Is there something clever you 
are doing that gets around these issues?

Thanks!  -Tyson



Re: Dynamic libraries by default and GHC 7.8

2012-11-28 Thread Tyson Whitehead
On November 28, 2012 04:45:57 Joachim Breitner wrote:
> Am Dienstag, den 27.11.2012, 21:57 -0500 schrieb Tyson Whitehead:
> > I was so excited for a bit thinking that this would finally mean that
> > Debian would move to a dynamic system.  Every haskell binary being 10s
> > of MBs (e.g., pandoc = 25MB executable) makes it look kind of bad.
> 
> it's not like dynamic libraries make the bytes disappear – the
> (non-Haskell-developer) user who wants to use pandoc still has to
> install all these bytes, but now they just come split in a dozen of
> packages.

My point was more trying to get at the idea that maybe we don't need a 
separate copy of most of the bytes in each application.

> Or git-annex, a more and more popular Haskell application: Building it
> requires 94 Haskell library packages. Now imagine this to be dynamically
> built: Now installing git-annex will require 94 strange-sounding packages
> that the user most likely has no idea what they are about, and chances
> are high that there is no other packages requiring these shared
> libraries, making most of the benefit of shared libraries moot.
> 
> Now, if Haskell was as popular as C and the user _would_ run several
> different processes at once that could share the shared library, this
> would be interesting. At the moment, I do not see how dynamically built
> Haskell programs are in the interest of our user.

I guess this is really a question of how many haskell programs are actually 
being used out there.  From the looks of popcon results, there isn't a whole 
lot of take-up on anything at the moment apart from ghc, xmonad, and pandoc.

> > I was left with the impression that we were going to have this back in
> > 2010 just as soon as squeeze got out the door...  :)
> 
> It seems that noone cared enough about that, but any help is welcome.
> Two things to do: Patch haskell-devscripts to build the -dyn ways, and
> manually adding the additional package stance to the debian/contol files
> (if it is to be decided that the -dyn libraries should reside in
> packages of their own. If we decide to include them in the regular
> packages, this is not needed.)

Fair enough.

If I were to update my 2010 patch so it worked again at some point in the 
upcoming year (I don't have the time to do this at the moment), would there be 
a reasonable chance it would be considered worthwhile to include at that point?

Please feel free to say no here if that is the case.  I realize that maybe in 
a few years, when there are even more haskell applications, we can revisit this 
again, and possibly then it will make more sense.

Cheers!  -Tyson

PS:  I don't mean to be critical here.  You've done a lot of work supporting 
haskell under Debian, and it's all volunteer.  I really appreciate that.



Re: Dynamic libraries by default and GHC 7.8

2012-11-27 Thread Tyson Whitehead
On Wed, 2012-11-28 at 00:28 +0100, Joachim Breitner wrote:
> Am Dienstag, den 27.11.2012, 14:52 + schrieb Ian Lynagh:
> > The various issues are described in a wiki page here:
> > http://hackage.haskell.org/trac/ghc/wiki/DynamicByDefault
> > 
> > If you have a few minutes to read it then we'd be glad to hear your
> > feedback, to help us in making our decisions
> 
> here comes the obligatory butting in by the Debian Haskell Group:
> 
> Given the current sensitivity of the ABI hashes we really do not want to
> have Programs written in Haskell have a runtime dependency on all the
> included Haskell libraries. So I believe we should still link Haskell
> programs statically in Debian.
>
> Hence, Debian will continue to provide its libraries built the static
> way.

I was so excited for a bit thinking that this would finally mean that
Debian would move to a dynamic system.  Every haskell binary being 10s
of MBs (e.g., pandoc = 25MB executable) makes it look kind of bad.

From our previous discussion on this

http://lists.debian.org/debian-haskell/2010/03/msg00120.html

it seemed a big part of the problem is you really need to be able to
have multiple versions of the same package installed.

At the time, I believe the only way to do this was to include part of
the hash in the name, but that meant you had to constantly be creating
new packages which Debian didn't make easy.

Maybe it is time to explain the problem to the Debian packaging people
and see if they have any ideas from their level?

> Building them also in the dynamic way for the sake of GHCi users seems
> possible.

I was left with the impression that we were going to have this back in
2010 just as soon as squeeze got out the door...  :)

http://lists.debian.org/debian-haskell/2010/03/msg00155.html

Cheers!  -Tyson

PS:  Another solution might be to move up one level.  That is, use the
packages in some way other than for actual standard packaging.

For example, they could be taken to mean what the user wants built.  The
installed packages then become the roots in a Nix-style store.

Another example: do something like the akmods system under Red Hat.  The
package actually builds a new locally generated package that is then
installed, to get around the can't-easily-create-new-packages issue.




Re: How to describe this bug?

2012-07-11 Thread Tyson Whitehead
On July 11, 2012 04:51:50 Christian Maeder wrote:
> Is it not enough to store floats into memory just before equality tests
> (or add rounding in the instance definitions of Float and Double in Eq
> and Ord)?

You have to be 100% consistent in how you do every operation in all cases, 
otherwise different levels of rounding error will creep into the results.

It isn't too hard to imagine a floating point expression getting inlined 
somewhere, and the compiler generating code to evaluate it all in registers.  
Intermediate operations will then be done to 80-bit precision.

Elsewhere, it doesn't get inlined and the compiler generates code to store 
intermediate results in memory.  Intermediate operations will then be done to 
32-bit precision.  Different results will occur at the rounding boundaries.

Always storing and reloading after every operation will give you consistent 
results.  I guess the other option is to disable inlining etc., or somehow 
ensure they are applied consistently in all use cases of an expression.
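A small illustration of the rounding-boundary point (my example, not tied to GHC's inliner): the same Float computation gives different answers depending on whether intermediates are rounded after each step or carried at higher precision and rounded once at the end.

```haskell
-- Rounded to Float after every addition: 2^24 + 1 is exactly halfway
-- between representable Floats, so round-to-even drops it back to 2^24,
-- twice in a row.
stepwise :: Float
stepwise = (16777216 + 1) + 1

-- Carried at Double precision (where both sums are exact) and rounded
-- to Float only once at the end.
once :: Float
once = realToFrac ((16777216 + 1) + 1 :: Double)

main :: IO ()
main = do
  print stepwise  -- 1.6777216e7
  print once      -- 1.6777218e7
```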

Cheers!  -Tyson



Re: How to describe this bug?

2012-07-10 Thread Tyson Whitehead
On July 10, 2012 10:39:41 Colin Adams wrote:
> Sure they would be better modelled that way, but the whole point of using
> floating point arithmetic is to sacrifice accuracy for performance, is it
> not?

True.  I just find it interesting that some types have a builtin Nothing value.

Some further examples are pointers (where NULL is Nothing) and 
clamped/saturation arithmetic (where the min and max values are Nothing).

It would be nice if somehow they could be unified at the top level without the 
performance penalty associated with a genuine Maybe value.

Another possibility might be to consider NaN to encode bottom in Float#.  When 
you constructed a Float from a Float# with a NaN, it would give bottom.

Would the existence of unboxed lifted types be a problem?

Cheers!  -Tyson



Re: How to describe this bug?

2012-07-10 Thread Tyson Whitehead
On July 10, 2012 09:28:27 Christian Maeder wrote:
> Am 10.07.2012 13:06, schrieb Sönke Hahn:
> > I've attached the code. The code does not make direct use of
> > unsafePerformIO. It uses QuickCheck, but I don't think, this is a
> > QuickCheck bug. The used Eq-instance is the one for Float.
> 
> The Eq-instance for floats is broken wrt NaN
> 
> Prelude> (0/0 :: Float) == 0/0
> False
> 
> I do not know if you create NaN in your tests, though.

Would that really be broken though?  NaN can arise from many contexts (e.g., 
sqrt(-1)), so it would also not make much sense to return True.

The IEEE standard actually defines a mutually exclusive fourth "unordered" 
state with respect to NaNs for comparisons (in addition to less than, greater 
than, and equal).

I would like to suggest native floating point might be better modelled as 
"Maybe Float", with NaN being the builtin "Nothing", but that leaves out Inf.

Cheers!  -Tyson



Modified layout for lambda-case

2012-07-09 Thread Tyson Whitehead
It seems my suggestion isn't getting too much traction.

It occurs to me this may be because I jumped right into layout rules and 
didn't first give some simple motivating examples to get the idea across.


SHORT VERSION:

Assume '\' is a grouping construct.  I proposed always interpreting

  ... \x -> do  (3a)
...

as

  ... \ { x -> do { ... } } (3b)

and never as

  ... \ { x -> do { }; ... }(3c)

or

  ... \ { x -> do { } } ... (3d)

This would be as opposed to our current situation, where all of these are 
possible depending on whether the second line starts past (3b), at (3c), or 
before (3d) the 'x'.

We lose nothing, since at best empty groups do nothing (let and where) and at 
worst are errors (do and case), as in the above.


LONGER VERSION:

What I was suggesting was to change the layout rule that gives you empty 
groupings.  Currently nonsensical nested empty where and let clauses like

  let { x = ... where { }; ... }(1a)
  do { let { }; ... }   (2a)

can be expressed using layouts

  let x = ... where (1b)
  ...

  do let(2b)
 ...

because each new grouping has to be indented further than the previous.  This 
trips us up if we change '\' into a grouping construct as the common

  ... \x -> do  (3a)
...

will get interpreted incorrectly as

  ... \ { x -> do { } } ...  (3d)

It all seems silly though.  At best an empty grouping does nothing (let and 
where) and at worst it is an error (do and case).  Why then not always go with

  ... \ { x -> do { ... } } (3b)

That is, if a line ends in a grouping clause and the next line is further 
indented, then what follows is the associated grouped statements.

It's also easier to read.  Further indented lines following a grouping clause 
are associated with that clause.  No need to identify other potential grouping 
clauses and check relative indentation levels.
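For concreteness (my example, using the lambda-case syntax this thread is about): under the proposed rule, the everyday shape below, where the body is simply indented past the line that opened the grouping, would always parse as (3b).

```haskell
{-# LANGUAGE LambdaCase #-}

describe :: Maybe Int -> String
describe = \case                 -- grouping construct ends the line...
  Just n  -> "got " ++ show n    -- ...so the indented lines are its body
  Nothing -> "nothing"

main :: IO ()
main = putStrLn (describe (Just 3))
```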

Cheers!  -Tyson



Re: Call to arms: lambda-case is stuck and needs your help

2012-07-07 Thread Tyson Whitehead
On July 7, 2012 00:08:26 Tyson Whitehead wrote:
> The very limited scope of this (i.e., it would only apply to lines that end
> with a grouping construct where the next line is indented further than that
> line) should also address Simon's concerns regarding things like
> 
>f x y = x + y
>  where  -- I just left this where here by accident
> 
>g x = ...
> 
> and
> 
>instance Exception Foo where
>instance Exception Bar

The only thing that would still break is a line that starts multiple 
groupings, where the last is a valid empty group (i.e., let or where) and the 
next line is further indented than the previous line.

The least grotesque/contrived examples I could think of are

  do let
 stmt1
 stmt2

and

  let f x = ... where
  g y = ...

Cheers!  -Tyson



Re: Call to arms: lambda-case is stuck and needs your help

2012-07-06 Thread Tyson Whitehead
On July 6, 2012 11:49:23 Tyson Whitehead wrote:
> Currently it depends on the depth of this new level of indentation relative
> to all the groupings started on that line.  I think most people would
> expect it to just apply to the last grouping though.  That is
> 
> where { f x = do {
>   stmt1
>   stmt2
> } }
> 
> mask $ let { x = do {
>   stmt1
>   stmt2
> } }
> 
> The rule in this case would be that if the grouping began on a newline that
> is indented farther than the previous line, the grouping is associated with
> the grouping token, and when it closes, it closes all those deeper than
> itself.

I've thought some more about this and it seems to me that there are two ways 
people might intuitively think about doing grouping via indentation.

1 - the first item is on the same line and subsequent ones are lined up with it

  do stmt1
 stmt2

2 - the first item is on a new line and subsequent ones are lined up with it.

  do
stmt1
stmt2

The current layout engine is targeted at (1).  It appears to do (2), but it is 
not really reliable, as things start to go south if the first line happened to 
open more than one grouping (precisely the problem that making '\' a grouping 
token would introduce in code).  For an example, consider

  let greet name = do
putStr "hello "
putStrLn name
  in f "world"

It currently translates into

  let { greet name = do {} } putStr "hello " putStrLn name in f "world"

This results in an unintuitive "Empty 'do' construct" error message.

I propose we detect (2) and make it work too.  That is, if the line ends 
with a grouping construct and the next line is indented relative to that line, 
then assume we really don't want '{}' and instead always start grouping (even 
if it isn't indented further than other possible groupings also started).

In other words, translate the above into

  let { greet name = do {
putStr "hello";
putStrLn name
  }} in f "world"

This would then correctly handle the problematic case raised in the wiki where

  mask $ \restore -> do
stmt1
stmt2

is translated into

  mask $ \ { restore -> do {} } stmt1 stmt2

under the current rules if '\' is made a grouping token.

The very limited scope of this (i.e., it would only apply to lines that end 
with a grouping construct where the next line is indented further than that 
line) should also address Simon's concerns regarding things like

   f x y = x + y
 where  -- I just left this where here by accident

   g x = ...

and

   instance Exception Foo where
   instance Exception Bar

Cheers!  -Tyson

PS:  To be fully precise, the modified layout decoder in 9.3 would be

  L (<n>:ts) i (m:ms)   = ; : (L ts n (m:ms))     if m = n
                        = } : (L (<n>:ts) n ms)   if n < m
  L (<n>:ts) i ms       = L ts n ms
  L ({n}:<n>:ts) i ms   = { : (L ts n (n:ms))     if n > i  (new rule)
  L ({n}:ts) i (m:ms)   = { : (L ts i (n:m:ms))   if n > m  (Note 1)
  L ({n}:ts) i []       = { : (L ts i [n])        if n > 0  (Note 1)
  L ({n}:ts) i ms       = { : } : (L (<n>:ts) i ms)         (Note 2)
  L (}:ts)   i (0:ms)   = } : (L ts i ms)                   (Note 3)
  L (}:ts)   i ms       = parse-error                       (Note 3)
  L ({:ts)   i ms       = { : (L ts i (0:ms))               (Note 4)
  L (t:ts)   i (m:ms)   = } : (L (t:ts) i ms)     if m /= 0 and parse-error(t)
                                                            (Note 5)
  L (t:ts)   i ms       = t : (L ts i ms)
  L []       i []       = []
  L []       i (m:ms)   = } : L [] i ms           if m /= 0 (Note 6)

  http://www.haskell.org/onlinereport/syntax-iso.html

As before, the function 'L' maps a layout-sensitive augmented token stream to 
a non-layout-sensitive token stream, where the augmented token stream includes 
'<n>' and '{n}' to, respectively, give the indentation level of the first 
token on a new line and that following a grouping token not followed by '{'.

This time though, we allow the '{n}' '<n>' sequence (before it was suppressed 
to just '{n}').  We also add a new state variable 'i' to track the indentation 
of the current line.  The new rule now opens a grouping over a newline so 
long as the indentation is greater than that of the current line.

Upon a less indented line, it will then close all currently open groups with 
an indentation less than the new line.



Re: Call to arms: lambda-case is stuck and needs your help

2012-07-06 Thread Tyson Whitehead
On July 6, 2012 05:25:15 Simon Marlow wrote:
> > Why not just let enclosed scopes be less indented than their outer ones?

Let me be entirely clear about what I was thinking about.  The third case for 
the layout mapping in Section 9.3 of the report is

  L ({n}:ts) (m:ms)  =  { : (L ts (n:m:ms))  if n > m

This function takes a possibly layout-sensitive token stream augmented with 
'{n}' for the indentation level of the first token following a grouping token 
that doesn't have a '{', and '<n>' for the indentation level of the first 
token after a newline that is not already augmented with '{n}'.  The 'L' 
function maps this to a non-augmented, non-layout-sensitive stream.

The first argument is the augmented layout stream and the second the current 
stack of indentations for any groupings in effect.  The rule above inserts a 
'{' if there isn't one after a grouping token and the next token is at a 
deeper level than the current grouping level.  I was proposing to make it 
always fire on indentation (i.e., allow enclosed scopes to be less indented).

  L ({n}:ts) (m:ms)  =  { : (L ts (n:m:ms))  if n > 0

The rest of the '{' insertion rules are for starting the first '{' at any 
indentation after a grouping token not followed by a '{', and for inserting a 
'{}' in all other cases.

  L ({n}:ts) [] =   { : (L ts [n])   if n > 0
  L ({n}:ts) ms =   { : } : (L (<n>:ts) ms)

http://www.haskell.org/onlinereport/syntax-iso.html

> I think this is undesirable.  You get strange effects like
> 
>f x y = x + y
>  where  -- I just left this where here by accident
> 
>g x = ...
> 
> parses as
> 
>f x y = x + y
>  where { -- I just left this empty where here by accident
> 
>g x = ...
>}
> 
> and
> 
>instance Exception Foo where
>instance Exception Bar
> 
> parses as
> 
>instance Exception Foo where {
>  instance Exception Bar
>}
>
> That is, layout contexts that should really be empty end up surprisingly
> swallowing the rest of the file.

These would be okay under the above so long as the following lines are not 
indented.  The issue only arises with nested ones

f x = ...
  where
g y = ...
  where
h y = ...

Now the h gets sucked into the inner where clause, as it is empty and nested.  Using 
the metric of what most people would expect, I agree the above is not ideal 
(although I would hope empty nested where clauses are not really in common use).

By this same metric though, I also think things like

where f x = do
  stmt1
  stmt2

mask $ let x = do
  stmt1
  stmt2

being parsed to

where { f x = do {}
} stmt1
   stmt2

mask $ let { x = do {}
  stmt1
  stmt2
}

is also not ideal.  The real underlying issue in both these cases and changing 
'\' to a group token seems to be what happens on single lines with multiple 
groupings when the last grouping actually starts on a newline indented further 
than the proceeding line.

Currently it depends on the depth of this new level of indentation relative to 
all the groupings started on that line.  I think most people would expect it 
to just apply to the last grouping though.  That is

where { f x = do {
  stmt1
  stmt2
} }

mask $ let { x = do {
  stmt1
  stmt2
} }

The rule in this case would be that if the grouping began on a newline that is 
indented farther than the previous line, the grouping is associated with the 
grouping token, and when it closes, it closes all those deeper than itself.

Cheers!  -Tyson



Re: Call to arms: lambda-case is stuck and needs your help

2012-07-05 Thread Tyson Whitehead
On July 5, 2012 10:42:53 Mikhail Vorozhtsov wrote:
> After 21 months of occasional arguing the lambda-case proposal(s) is in
> danger of being buried under its own trac ticket comments. We need fresh
> blood to finally reach an agreement on the syntax. Read the wiki
> page[1], take a look at the ticket[2], vote and comment on the proposals!

If I understand correctly, we currently have

  \ apat1 ... apatn -> exp

The possibility using '\' as a layout herald (like let, do, etc.)

  \ { apat1 ... apatn -> exp; ... }

is suggested on the wiki, but rejected because code like so

  mask $ \restore -> do
stmt1
...

by translating it into (Section 9.3 of the 98 Report)

  mask $ \ { restore -> do { }
} stmt1

  http://www.haskell.org/onlinereport/syntax-iso.html

The reason for this is

1 - the layout level for '\' is the column of the 'restore' token

2 - the layout level for 'do' would be the column of the first token of 'stmt1'

3 - the '\' level is greater than the potential 'do' level so the fall through 
'{}' insertion rule fires instead of the desired '{' insertion rule

4 - the '\' level is greater than the indentation level of the first token of 
'stmt1' (now established to not be part of the 'do') so the '}' rule fires

Why not just let enclosed scopes be less indented than their outer ones?

It would then correctly translate the above.  This of course implies that any 
code that requires the original translation (i.e., where the last of multiple 
enclosing blocks should be an empty block) would no longer work.

Is any code actually relying on this though?  It seems like a pretty esoteric 
corner case.  If not, my vote would be to relax this rule and go with '\' 
being a layout herald with full case style matching (i.e., guards too).

Cheers!  -Tyson



Re: parallel garbage collection performance

2012-06-18 Thread Tyson Whitehead
On June 18, 2012 04:20:51 John Lato wrote:
> Given this, can anyone suggest any likely causes of this issue, or
> anything I might want to look for?  Also, should I be concerned about
> the much larger gc_alloc_block_sync level for the slow run?  Does that
> indicate the allocator waiting to alloc a new block, or is it
> something else?  Am I on completely the wrong track?

A total shot in the dark here, but wasn't there something about really bad 
performance when you used all the CPUs on your machine under Linux?

Presumably there is some very tight coupling that is causing all the threads to 
stall every time the OS needs to do something?

Cheers!  -Tyson



Re: default instance for IsString

2012-04-25 Thread Tyson Whitehead
On April 25, 2012 12:20:16 Johan Tibell wrote:
> On Wed, Apr 25, 2012 at 8:19 AM, Tyson Whitehead  
wrote:
> > Is there a technical reason this couldn't be done?  The Haskell report
> > only says doing this is not part of haskell.  It doesn't say why.
> 
> I think the problem is incoherence, what if the same Map value got
> used with two different instances of Int?

I'm not sure I follow how allowing control over importing of instances could 
allow a programmer to define multiple instances for the same types.

I would have expected this to result in a link time error as a product of 
multiple declarations (something like a multiple symbol definition), regardless 
of whether any module brings it into scope as a possible candidate for use.

Cheers!  -Tyson



Re: default instance for IsString

2012-04-25 Thread Tyson Whitehead
On April 25, 2012 04:15:41 Yitzchak Gale wrote:
> The only reason I don't like using OverloadedStrings
> for typing string literals as Text and ByteString
> is that when you turn on OverloadedStrings, you turn
> it on for all types, not just Text and ByteString.
> I don't want to be forced to do that. Because
> all other uses of OverloadedStrings that I have
> seen, and there are many, are ill-advised in my
> opinion. They all should have been quasiquoters.

Maybe what you really want is the ability to control instance imports?

Is there a technical reason this couldn't be done?  The Haskell report only 
says doing this is not part of Haskell.  It doesn't say why.
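For context, OverloadedStrings desugars every string literal to `fromString`, so any in-scope `IsString` instance becomes a candidate for a literal's type; there is currently no way to admit instances only for, say, Text.  A minimal illustration, using a hypothetical `Name` type rather than any real library:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.String (IsString (..))

-- A hypothetical newtype: once this instance is in scope, any string
-- literal can be typed as Name -- instances cannot be imported selectively.
newtype Name = Name String

instance IsString Name where
  fromString = Name

greet :: Name -> String
greet (Name n) = "hello, " ++ n

main :: IO ()
main = putStrLn (greet "world")   -- literal elaborates to fromString "world"
```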

Cheers!  -Tyson



Re: [saj...@gmail.com: Google Summer of Code: a NUMA wishlist!]

2012-03-28 Thread Tyson Whitehead
On March 28, 2012 12:40:02 Sajith T S wrote:
> Tyson Whitehead  wrote:
> > Intel is more recent to this game.  I believe AMD's last non-NUMA
> > machines were the Athlon XP series and Intel's the Core 2 series.
> > 
> > An easy way to see what you've got is to see what 'numactl --hardware'
> > says.
> 
> Ah, thanks.  I trust this one qualifies?
> 
> $ numactl --hardware
> available: 4 nodes (0-3)
> node 0 cpus: 0 4 8 12 16 20 24 28
> node 0 size: 16370 MB
> node 0 free: 14185 MB
> node 1 cpus: 1 5 9 13 17 21 25 29
> node 1 size: 16384 MB
> node 1 free: 10071 MB
> node 2 cpus: 2 6 10 14 18 22 26 30
> node 2 size: 16384 MB
> node 2 free: 14525 MB
> node 3 cpus: 3 7 11 15 19 23 27 31
> node 3 size: 16384 MB
> node 3 free: 13598 MB
> node distances:
> node   0   1   2   3
>   0:  10  20  20  20
>   1:  20  10  20  20
>   2:  20  20  10  20
>   3:  20  20  20  10

Yup.  For sure.  Here's an example from a 4 socket 16 core non-NUMA Intel Xeon

$ numactl --hardware
available: 1 nodes (0)
node 0 size: 129260 MB
node 0 free: 1304 MB
node distances:
node   0 
  0:  10 

On a NUMA system I believe you should be able to get an idea of the worst-case 
penalty a program experiences from having all of its memory accesses go 
across the QuickPath Interconnect (Intel)/HyperTransport (AMD) by forcing it 
to execute on one socket while using the memory on another.

$ numactl --cpunodebind=0 --membind=1 

There is also some good information under /proc/<pid>/numa_maps.  See the man 
page for details, but basically it tells you how many pages are associated 
with each node for each part of the program's address space.

Note that file-backed pages don't always reside in the node you bound them 
to, as the system may have already mapped them into memory on another node for 
an earlier process.

Apologies if you are already familiar with these items.

Cheers!  -Tyson

PS:  That looks like a pretty sweet box you've got going there.  :)



Re: [saj...@gmail.com: Google Summer of Code: a NUMA wishlist!]

2012-03-28 Thread Tyson Whitehead
On March 28, 2012 04:41:16 Simon Marlow wrote:
> Sure.  Do you have a NUMA machine to test on?

My understanding is that non-NUMA machines went away when AMD and Intel moved 
away from front-side buses (FSB) and integrated the memory controllers on-die.

Intel is more recent to this game.  I believe AMD's last non-NUMA machines 
were the Athlon XP series and Intel's the Core 2 series.

An easy way to see what you've got is to see what 'numactl --hardware' says.  
If the node distance matrix is not uniform, you have NUMA hardware.

As an example, on a 8 socket Opteron machine (32 cores) you get

$ numactl --hardware
available: 8 nodes (0-7)
node 0 size: 16140 MB
node 0 free: 3670 MB
node 1 size: 16160 MB
node 1 free: 3472 MB
node 2 size: 16160 MB
node 2 free: 4749 MB
node 3 size: 16160 MB
node 3 free: 4542 MB
node 4 size: 16160 MB
node 4 free: 3110 MB
node 5 size: 16160 MB
node 5 free: 1963 MB
node 6 size: 16160 MB
node 6 free: 1715 MB
node 7 size: 16160 MB
node 7 free: 2862 MB
node distances:
node   0   1   2   3   4   5   6   7 
  0:  10  20  20  20  20  20  20  20 
  1:  20  10  20  20  20  20  20  20 
  2:  20  20  10  20  20  20  20  20 
  3:  20  20  20  10  20  20  20  20 
  4:  20  20  20  20  10  20  20  20 
  5:  20  20  20  20  20  10  20  20 
  6:  20  20  20  20  20  20  10  20 
  7:  20  20  20  20  20  20  20  10 

On our more traditional NUMA machine there are 64 nodes and the numbers range from 
10-37.  But it's an older SGI Itanium solution, so that comes with its own set 
of problems, and most modern machines are already outperforming it.

Cheers!  -Tyson



Re: How unsafe are unsafeThawArray# and unsafeFreezeArray#

2012-03-09 Thread Tyson Whitehead
On March 9, 2012 06:44:33 John Meacham wrote:
> A weakness of jhc right now is its stop the world garbage collector,
> so far, I have been mitigating it by not creating garbage whenever
> possible, I do an escape analysis and allocate values on the stack
> when possible, and recognize linear uses of heap value in some
> circumstances and re-use heap locations directly (like when a cons
> cell is deconstructed and another is constructed right away I can
> reuse the space in certain cases) but eventually a full GC needs to
> run and it blocks the whole runtime which is particularly not good for
> embedded targets (where jhc seems to be thriving at the moment.). My
> unsafeFreeze and unsafeThaw are currently NOPs. frozen arrays are just
> a newtype of non-frozen ones (see
> http://repetae.net/dw/darcsweb.cgi?r=jhc;a=headblob;f=/lib/jhc-prim/Jhc/Pri
> m/Array.hs)

Here's an idea I've had bouncing around in the back of my head for a couple of 
years.  I haven't found the time to think it out in full detail yet, so I 
apologize if it has some obvious flaw or has already been done.

The idea is to add a primitive/runtime function

  gc :: a -> a

that implements a garbage collection scheme.  That is, say

  gc expression

occurs in the code somewhere.  At some point the expression may need to be 
evaluated to WHNF.  When this occurs, the effect of 'gc' is that all the memory 
allocations required to do this come out of a new pool of memory.

When the reduction is complete, the final value is treated as the root of this 
region, and it and all reachable points in the region are copied out.  The 
region is then released as a whole and evaluation continues.

The idea is this allows the programmer to control where garbage collection 
occurs (it only occurs on reduction of 'gc expression' expressions) and put 
bounds on how long it will take (it will only take as much time as it takes to 
walk the structure of the returned value -- a function of that value).

I believe it is composable and can give you what you need to reason about 
garbage collection bounds.  The only real catch I see is that laziness could 
make this reasoning hard due to determining exactly where reduction occurs.
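No such region-based primitive exists in GHC, but the observable behaviour — evaluate to WHNF, then collect — can be loosely approximated with existing functions.  This is only a sketch of the interface, not the proposed region semantics:

```haskell
import Control.Exception (evaluate)
import System.Mem (performGC)

-- Loose approximation of the proposed 'gc :: a -> a': force the value to
-- WHNF, then run a (global, stop-the-world) collection.  Unlike the
-- proposal, this does not confine allocation to a private region.
gcHere :: a -> IO a
gcHere x = do
  r <- evaluate x
  performGC
  return r

main :: IO ()
main = do
  s <- gcHere (sum [1 .. 1000 :: Int])
  print s
```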

Cheers!  -Tyson



Re: Posting etiquette, was Re: Records in Haskell

2012-01-19 Thread Tyson Whitehead
On January 19, 2012 05:14:30 Malcolm Wallace wrote:
> I find it completely unreasonable for a reply to a very long post to quote
> the entire text, only to add a single line at the bottom (or worse,
> embedded in the middle somewhere).

+1



Re: Unit unboxed tuples

2012-01-11 Thread Tyson Whitehead
On January 11, 2012 08:41:04 Simon Marlow wrote:
> On 10/01/2012 16:18, Dan Doel wrote:
> > Does the difference have to do with unboxed types? For instance:
> > foo :: () ->  Int#
> > foo _ = foo ()
> > bar :: () ->  (# Int# #)
> > bar _ = (# foo () #)
> > 
> > baz = case bar () of _ ->  5  -- 5
> > quux = case foo () of _ ->  5 -- non-termination
> > 
> > Because in that case, either (# Int# #) is lifted, or the Int# is
> > effectively lifted when inside the unboxed tuple. The latter is a bit
> > of an oddity.
> 
> Unboxed types cannot be lifted, so in fact bar compiles to this:
> 
>bar = \_ -> case foo () of x -> (# x #)
> 
> and both baz and quux diverge.

I tried both of these and it seems the lack of a constructor in the case 
expression results in the evaluation not being forced, so neither diverged.

Using a lifted type and adding a constructor to the case did the trick though.

{-# LANGUAGE MagicHash, UnboxedTuples #-}

module Main where

import GHC.Exts

g :: () -> Int
g _ = g ()

f :: () -> (# Int #)
f _ = (# g () #)

main_baz :: IO ()
main_baz = putStrLn $ case g () of (I# _) -> "this one diverges"

main_quux :: IO ()
main_quux = putStrLn $ case f () of (# _ #) -> "this one doesn't diverge"

Cheers!  -Tyson



Re: Unit unboxed tuples

2012-01-10 Thread Tyson Whitehead
On January 10, 2012 12:18:13 wren ng thornton wrote:
> On 1/10/12 11:18 AM, Dan Doel wrote:
> > On Tue, Jan 10, 2012 at 11:14 AM, Dan Doel  wrote:
> >> On Tue, Jan 10, 2012 at 5:01 AM, Simon Marlow  wrote:
> >>> On 09/01/2012 04:46, wren ng thornton wrote:
>  Shouldn't (# T #) be identical to T?
> >>> 
> >>> No, because (# T #) is unlifted, whereas T is lifted.  In operational
> >>> terms, a function that returns (# T #) does not evaluate the T before
> >>> returning it, but a function returning T does.  This is used in GHC
> >>> for example to fetch a
> >>> 
> >>> value from an array without evaluating it, for example:
> >>>   indexArray :: Array# e ->  Int# ->  (# e #)
> 
> With my revised understanding we have the following family of types:
> 
>  T pointed,   lifted,   lazy
>  (# T #)   pointed?,  unlifted, lazy
>  !Tunpointed, lifted,   eager
>  {-#UNPACK#-}!Tunpointed, unlifted, eager
> 
> where the two !T types are only specified in ADTs and through strictness
> analysis. Though this doesn't explain the difference in operational
> behavior for their use as return types.

IIRC, don't functions in the STG machine evaluate their returned values to 
WHNF, as the very fact that a (lazy) function has come under evaluation implies 
that its result is to be deconstructed (in a case statement)?

An unboxed tuple would already be in WHNF, so this would not be the case (it 
would be wrong to evaluate the components, as the case statement forcing the 
evaluation may do no more than deconstruct the unboxed tuple).
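That reading can be checked directly: a function returning (# e #) hands its component back unevaluated, so even an undefined payload survives deconstruction.  A small sketch:

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

-- Returning (# a #) does not force the component: the unboxed tuple is
-- already in WHNF, and the case below only deconstructs it.
lazyFst :: (a, b) -> (# a #)
lazyFst (x, _) = (# x #)

main :: IO ()
main = case lazyFst (undefined :: Int, ()) of
  (# _ #) -> putStrLn "component was not forced"
```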

Cheers!  -Tyson



Re: Unit unboxed tuples

2012-01-10 Thread Tyson Whitehead
On January 8, 2012 23:49:47 wren ng thornton wrote:
> An alternative is to distinguish, say, (# x #) and its spaceful
> constructor (# #) from the spaceless (##); and analogously for the boxed
> tuples, though that introduces confusion about parentheses for boxing vs
> parentheses for grouping.

I think that sounds pretty good.  Here is another that occurred to me today

  (#), (# a #), (# a, b #), (# a, b, c #) ...

If you replace the internal ',' with '#'

  (#), (# a #), (# a # b #), (# a # b # c #), ...

you have number of elements = number of '#' - 1.

Cheers!  -Tyson



Re: Unit unboxed tuples

2011-12-23 Thread Tyson Whitehead
On December 23, 2011 09:37:04 Ganesh Sittampalam wrote:
> On 23/12/2011 13:46, Ian Lynagh wrote:
> > On Fri, Dec 23, 2011 at 01:34:49PM +, Simon Peyton-Jones wrote:
> >> Arguments   Boxed  Unboxed
> >> 3   ( , , )(# , , #)
> >> 2   ( , )  (# , #)
> >> 1
> >> 0   () (# #)
> 
> It's worth mentioning that if you want to write code that's generic over
> tuples in some way, the absence of a case for singletons is actually a
> bit annoying - you end up adding something like a One constructor to
> paper over the gap. But I can't think of any nice syntax for that case
> either.

I believe Python uses (expr,) (i.e., nothing following the ',') to distinguish a 
singleton tuple from a parenthesized term.  Not great, but possibly not that bad.

The other option would be to introduce another unambiguous bracket symbol 
for tuples.  The new symbol would be optional except for the singleton.

(- expr, expr, expr -)  =  (expr, expr, expr)
(- expr, expr -)  =  (expr, expr)
(- expr -)  =  
(- -)  =  ()

Nothing has to be done for (# #) as it doesn't have the ambiguity.

Cheers!  -Tyson



Re: performance issues in simple arithmetic code

2011-04-27 Thread Tyson Whitehead
On April 27, 2011 23:01:50 Denys Rtveliashvili wrote:
> Question 1: what is the meaning of those magic numbers
> -9223372036854775808, -6677706553, -7418931981405, -8257271295304186?
> Question 2: under which circumstances those strange branches of
> execution will be used and what those results would mean?
> Question 3: why is the Core for 'foo' so different to 'bar'?

The most negative representable 64-bit number in two's complement is 
-9223372036854775808.  It doesn't have a positive counterpart, as 0 takes one 
value out of the range of positive numbers.

The code for "quot x y" therefore checks for the special case "x==minBound" 
and "y==-1" in order to be able to throw an overflow error.  I suspect what 
you are seeing is a specialization for this branch.

That is, the start of the overflow case check introduced a branch for 
x==-9223372036854775808.  The compiler then notices that it knows both x and y 
here, so it just puts in the answer, resulting in a permanent special case.

Your foo and bar code is so different because direct use of the quotInt# 
primitive avoids this check and hence the resulting specialization for it.
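The special case is easy to trigger: minBound divided by -1 is the one quotient that overflows (on a 64-bit machine the true result, 2^63, is not representable), and GHC's quot throws an Overflow exception for it.  A quick check:

```haskell
import Control.Exception (ArithException, evaluate, try)

main :: IO ()
main = do
  -- the one overflowing case: minBound `quot` (-1) would be -minBound,
  -- which is not representable in a fixed-width Int
  r <- try (evaluate ((minBound :: Int) `quot` (-1)))
         :: IO (Either ArithException Int)
  print r
```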

Cheers!  -Tyson


signature.asc
Description: This is a digitally signed message part.


Re: Tracing idea

2011-02-20 Thread Tyson Whitehead
On February 19, 2011 12:11:13 Edward Z. Yang wrote:
> I was thinking it might be useful if we had a per-thread circular buffer in
> memory for which we pushed a pointer to the info table we had just entered.
>  In the event of a crash, you could dump the contents of the buffer to see
> what code had been recently executed.  To reduce overhead, one might only
> need to record this information when we do a tail call, and not when the
> path can be reconstructed from the stack.  (Obviously, you'd need to
> compile in debug mode, and you'd probably want to also add an RTS flag).
> 
> What do people think?
 
I believe a back trace on the actual call stack is generally considered not 
that useful in a lazy language, as it corresponds to the evaluation sequence.  
That is, it is demand-centric, while written code is production-centric.

http://hackage.haskell.org/trac/ghc/wiki/Debugging/CompiledCode

Here's a link that summarizes some thoughts on production centric back traces

http://hackage.haskell.org/trac/ghc/wiki/ExplicitCallStack

Cheers!  -Tyson




Re: Faster Array#/MutableArray# copies

2011-02-18 Thread Tyson Whitehead
On February 17, 2011 20:18:11 Johan Tibell wrote:
>  * Can we use SSE instructions?
> 
>  * Can we get the C memcpy code inlined into the C-- source (crazy, I
> know). If so we could perhaps benefit directly from optimizations in
> libc.

From the standpoint of numerical code, it would be very nice to get some 
vector support.  Perhaps this would also give a natural expression of memcpy.

In the spirit of the common "infinite" register file assumption, I've always 
imagined idealized arbitrary-length vector types/operations at the C-- level 
(mapped to reality via chunking it over fixed length vector instructions).

I see that LLVM supports arbitrary-length vectors as a first-class type

http://llvm.org/docs/LangRef.html#t_vector
http://llvm.org/docs/LangRef.html#i_add

Looking at the C-- specification (section 2.4 -- page 10)

http://www.cminusminus.org/extern/man2.pdf

it seems this may not be such a good fit, as it considers everything as 
fixed-size bit collections (bits8, bits16, etc.) and booleans (bool).  Presumably 
memory-oriented primitive instructions are also out, as they break the strict 
load/store architecture.  Has anyone thought of how to do this?

Some half baked suggestions

 - perhaps it should be fixed-size bit collections with repetition, or
 - built in primitives for instructions composition (i.e., a map operation)?

Cheers!  -Tyson




Re: libffi was:Re: ANNOUNCE: GHC version 7.0.1

2010-11-18 Thread Tyson Whitehead
On November 18, 2010 05:12:11 Simon Marlow wrote:
> On 17/11/2010 14:34, Christian Maeder wrote:
> > ghc can be built without and with libffi.
> 
> Which build option are you referring to here?  libffi is required for
> FFI support in GHCi, and for FFI "wrapper" imports.  However on x86 and
> x86_64 we don't normally use libffi for wrappers, because we have a
> native implementation that is a bit faster (this is the
> UseLibFFIForAdjustors build option).

I was hoping someone might comment further on the selinux issues.

I would gather from this though that the native implementation must now avoid 
the "selinux doesn't like pages that are both executable and writable" issue?

http://hackage.haskell.org/trac/ghc/ticket/738

(it seems like the solution that closed 738 was to use libffi)

Cheers!  -Tyson




Re: libffi was:Re: ANNOUNCE: GHC version 7.0.1

2010-11-17 Thread Tyson Whitehead
On November 17, 2010 09:34:02 Christian Maeder wrote:
> ghc can be built without and with libffi. What advantage do I gain in
> the latter case? The packages that come with ghc (displayed by "ghc-pkg
> dump") don't use it.

I can't find this right now, but, from when I last looked through the code, I 
seem to recall something about libffi helping you not upset selinux.

Cheers!  -Tyson




Re: Parallel Haskell: 2-year project to push real world use

2010-05-08 Thread Tyson Whitehead
On May 3, 2010 08:04:14 Johan Tibell wrote:
> On Mon, May 3, 2010 at 11:12 AM, Simon Peyton-Jones
> wrote:
> > In truth, nested data parallelism has taken longer than we'd hoped to be
> > ready for abuse :-).   We have not lost enthusiasm though -- Manuel,
> > Roman, Gabi, Ben, and I talk on the phone each week about it.  I think
> > we'll have something usable by the end of the summer.
>
> That's very encouraging! I think people (me included) have gotten the
> impression that the project ran into problems so challenging that it
> stalled. Perhaps a small status update once in a while would give people a
> better idea of what's going on. :)

Most will likely see this on Planet Haskell, but it seems worthwhile mentioning, 
as there seems to be quite a bit of interest in the technology.  Manuel just 
finished posting quite a nice talk given by Simon PJ recently in Boston

http://pls.posterous.com/simon-peyton-jones-on-data-parallel-haskell

Related to some of the questions asked at the talk, I would be curious to hear 
any comments regarding adding support for processor level SIMD vectorization 
(e.g., the SSE{1,2,3} instructions on the x86) in conjunction with NDPH.  It 
would seem conceptually as simple as having "ideal" vector primitive 
operations and coding the basic operations (e.g., sumS) in terms of those.

This would then presumably have a very positive impact on the existing 
Vector, Repa, etc. libraries as well, and would seem quite a bit easier than 
trying to recognize opportunities for these instructions via loop 
analysis/whatever and insert them after the fact?  Or is the recognition 
option felt to be "better" (i.e., in the sense it would also be applicable in 
other situations as well) and easily done with the big data-flow hammer?

I see there is a trac ticket regarding SIMD instructions

http://hackage.haskell.org/trac/ghc/ticket/3557

so I'm guessing at least part/all of the answer is just the time to do the 
grunt work to add in the support to CMM and the native code generators.

Cheers!  -Tyson




Re: Parallel Haskell: 2-year project to push real world use

2010-04-30 Thread Tyson Whitehead
On April 30, 2010 06:32:55 Duncan Coutts wrote:
> In the last few years GHC has gained impressive support for parallel
> programming on commodity multi-core systems. In addition to traditional
> threads and shared variables, it supports pure parallelism, software
> transactional memory (STM), and data parallelism. With much of this
> research and development complete, and more on the way, the next stage
> is to get the technology into more widespread use.

Does this mean DPH is ready for abuse?

The wiki page sounds pretty tentative, but it looks like it's been a while 
since it was updated.

http://www.haskell.org/haskellwiki/GHC/Data_Parallel_Haskell

Thanks!  -Tyson




Literate files in the GHC source

2010-04-17 Thread Tyson Whitehead
I'm just wondering how exactly the literate files in the GHC source are supposed 
to be treated.  I would like to read some of the documentation.

The presence of various latex commands such as subsection strongly suggests 
they are done in latex.  Running pdflatex, however, doesn't work.  Nor does 
created a wrapper latex file to define the document class, the code 
environment, 
etc., as it then bombs out on such things as non-escaped '#'s.

I searched around on the Internet and the mailing list archives and didn't find 
anything.  If I had to guess, I would suspect it started as actual latex 
documentation, but has fallen into a state of disrepair.  Is this correct?

Thanks!  -Tyson




Re: "static_wrapper" imports in the FFI

2010-03-17 Thread Tyson Whitehead
On March 16, 2010 20:01:32 Iavor Diatchki wrote:
> Optionally disabling executable heap blocks would be a separate patch.
>  As far as I know, the only reason that the heap is executable is to
> support the adjustor thunks used to implement "wrapper" imports.  The
> "static_wrapper" patch provides a way to install Haskell callbacks in
> many C libraries without the need for adjustor thunks.

I believe this is the code in "rts/Adjustor.c" and "rts/sm/Storage.c".  It (or 
the libffi code it delegates to) writes a small bit of assembler that adds a 
hard-coded pointer (to a StablePtr) to the argument list and jumps to a 
hard-coded address.  It then has to fiddle with the executable bits on the 
memory page it wrote the code into in order to allow the system to execute it.

This leads me to ask, though: could you not also tighten up the security here 
by just getting the system to turn off the writable bit when it also turns 
on the executable one?  I realize this implies that you will only get one of 
these per page, but still that might not be that bad if you don't generate 
very many and recycle them.

As a compromise, you could also just temporarily make pages writable when you 
add to them, thus greatly minimizing the attack window.  If the OS could freeze 
all other threads while doing this, there would be no window.  If their 
generation and usage were localized per OS thread, then modification would 
always be safe provided each OS thread works on its own page.

I scanned the ghc source (all c, h, cmm, hs, and lhs files), and the only usage 
of import "wrappers" seems to be in System.Console.Terminfo.Base.
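For concreteness, this is the shape of import that needs an adjustor: each call to the generated maker produces a fresh C-callable stub at runtime.  A minimal sketch (the callback type here is hypothetical, not terminfo's):

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CInt)
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

-- A "wrapper" import: mkCallback conjures a fresh adjustor thunk each
-- time it is called, which is why the RTS needs memory that is both
-- writable (to emit the stub) and executable (to run it).
foreign import ccall "wrapper"
  mkCallback :: (CInt -> IO ()) -> IO (FunPtr (CInt -> IO ()))

main :: IO ()
main = do
  fp <- mkCallback print   -- in real code, hand fp to a C library
  freeHaskellFunPtr fp     -- release the adjustor when done
  putStrLn "adjustor created and freed"
```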

Cheers!  -Tyson




Re: Feedback request: priority queues in containers

2010-03-16 Thread Tyson Whitehead
On March 16, 2010 13:24:32 k...@cas.mcmaster.ca wrote:
> Just an aside (and shameless plug ;-): Since the signatures overlap so
> much, it is in fact easy to wrap these modules into instances
> for many possible different type classes

Yes.  I can see the preferred way would be to put the classes on top of the 
implementation, instead of the implementations on top of the classes.

Cheers!  -Tyson




Re: Feedback request: priority queues in containers

2010-03-16 Thread Tyson Whitehead
On March 16, 2010 09:29:06 Louis Wasserman wrote:
> I'd like to request some more feedback on the
> proposedimplementation
> for priority queues in containers.  Mostly, I feel like
> adding a new module to containers should be contentious, and there hasn't
> been as much griping or contention as I expected.  The silence is feeling
> kind of eerie!

Not sure if this is an appropriate question as I haven't looked at the 
code, but would it be possible to put the primary functionality into a class?

I'm thinking something along the lines of how the vector code works.  This 
allows you to use all the higher-order functions (i.e., those implemented 
using the primary functions) on a different underlying implementation.

I've found this particularly useful in wrapping Perl data types.  For the Perl 
array, all I had to do was write an class instance for the vector module, and 
I have all these higher-order functions I could use from existing code.

It would be very nice to have something similar for the hash tables.  
Even just providing a "native Haskell" immutable view into the data would let 
Haskell code extract the components it needs with standard functions.

Cheers!  -Tyson

PS:  I'm still working on the wrapping, so I might change my mind as to how 
useful this really is, but thought I should throw it out there.




Re: "static_wrapper" imports in the FFI

2010-03-15 Thread Tyson Whitehead
On March 15, 2010 19:54:52 Iavor Diatchki wrote:
> In terms of notation, I like the directness of the "static_wrapper"
> declaration (although not so much the "static_wrapper" name!) because
> it avoids duplication, thus reducing clutter and potential errors.

If it does get accepted, I would propose that adding a new closure form to the 
existing import and export forms might look a bit nicer.  As in

 foreign closure ccall "cname" haskellname :: type

gives you a C function (as you said) of the form

 result cname(arguments ..., HsStablePointer haskellname);

This gets around the non-obviousness of an import also doing an export.

If you didn't want to put the closure in the type, you could also support 
"first cname" and "last cname" (the default) versions to specify where the 
closure should be passed.

Cheers!  -Tyson




Re: "static_wrapper" imports in the FFI

2010-03-15 Thread Tyson Whitehead
On March 15, 2010 12:48:02 Iavor Diatchki wrote:
> I have implemented a small extension to the FFI which allows for
> "static_wrapper" imports.  These are a variation on "wrapper" imports
> that do not use run-time code generation.  This is important in
> security sensitive contexts because it avoids executable data.  While
> "static_wrapper" imports are less general then "wrapper" imports, they
> can be used to install Haskell handlers in many C libraries where
> callbacks have an extra "user-data" parameter (e.g., GTK signal
> handlers).

Hi Iavor,

Would not the following also do what you want

 foreign export ccall "haskellCIntCInt"
   cbind :: CInt -> StablePtr (CInt -> IO CInt) -> IO CInt

 cbind :: a -> StablePtr (a-> IO b) -> IO b
 cbind x f = deRefStablePtr f >>= (\f_ -> f_ x)

On the C side you would then have something like

 register_callback(haskellCIntCInt, <stable-pointer>)

where <stable-pointer> would be a stable pointer of type StablePtr 
(CInt -> IO CInt) generated on the Haskell side via

 newStablePtr <your-callback>

and passed to the C code.
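In case it helps to see the pieces together, here is a self-contained sketch of the cbind round trip done entirely on the Haskell side (no C involved; the stand-in callback and the plain Int type are mine, the real version would use CInt and a C-side register_callback):

```haskell
import Foreign.StablePtr (StablePtr, newStablePtr, deRefStablePtr, freeStablePtr)

-- The generic binder from the post: dereference the stable pointer
-- and apply the callback it holds to the argument.
cbind :: a -> StablePtr (a -> IO b) -> IO b
cbind x f = deRefStablePtr f >>= \g -> g x

main :: IO ()
main = do
  sp <- newStablePtr (\n -> return (n + 1 :: Int))  -- stand-in callback
  r  <- cbind 41 sp   -- this is what the C side would trigger
  print r             -- 42
  freeStablePtr sp    -- release the stable pointer when done
```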

Cheers!  -Tyson




(read getLine) + (read getLine)

2010-03-04 Thread Tyson Whitehead
Consider what would happen if GHC optionally inserted a morphism on fexps (the 
expression at the head of an application) as it currently does with integer 
literals and optionally with string literals.  A standard function call

  f x y 

would windup as

  morph (morph f x) y

With the following basic classes

  -- Functor
  class F t where
  func :: (a -> b) -> t a -> t b

  -- Pointed
  class F t => P t where
  pure :: a -> t a

  -- Applicative
  class P t => A t where
  apply :: t (a -> b) -> t a -> t b

  -- Monad
  class A t => M t where
  join :: t (t a) -> t a

we can define a standard set of morphisms

  -- Category
  class C a b where
  morph :: a -> b

  instance C (a -> b) (a -> b) where
  morph = id

  instance F t => C (a -> b) (t a -> t b) where
  morph = func

  instance P t => C (a -> b) (a -> t b) where
  morph f = func f . pure

  instance A t => C (t (a -> b)) (t a -> t b) where
  morph = apply

  instance A t => C (t (a -> b)) (a -> t b) where
  morph f = apply f . pure

  instance M t => C (t (a -> t b)) (t a -> t b) where
  morph f = join . apply f

  instance M t => C (t (a -> t b)) (a -> t b) where
  morph f = join . apply f . pure

  instance M t => C (a -> t b) (t a -> t b) where
  morph f = join . apply (pure f)

  instance M t => C (a -> t b) (a -> t b) where
  morph f = join . apply (pure f) . pure

If GHC could correctly pick between them (possibly by just trying them from 
most specific to least?), then we would get the following transformation

  (read getLine) + (read getLine)

  --> morph (morph (+) (morph read getLine)) (morph read getLine)

  --> apply (func (+) (func read getLine)) (func read getLine)

or, using the names in the standard library (include Control.Applicative),

  (<*>) (fmap (+) (fmap read getLine)) (fmap read getLine)

which does the right thing.
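For comparison, here is the same shape written out by hand with today's Applicative combinators, using Maybe instead of IO so the result is easy to check (readMaybe is from Text.Read; the IO analogue would be (+) <$> fmap read getLine <*> fmap read getLine).  This is, of course, exactly the boilerplate the proposed automatic morphism insertion would write for you:

```haskell
import Text.Read (readMaybe)

main :: IO ()
main = do
  -- apply (func (+) (func read m1)) (func read m2), spelled with <$> and <*>
  print ((+) <$> (readMaybe "1" :: Maybe Int) <*> readMaybe "2")  -- Just 3
  -- a failed parse propagates through the Applicative structure
  print ((+) <$> (readMaybe "x" :: Maybe Int) <*> readMaybe "2")  -- Nothing
```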

Anyway, thought I would throw it out there in case it could be made to work.

Cheers!  -Tyson


Re: Issue with type families

2010-03-04 Thread Tyson Whitehead
On March 3, 2010 21:10:56 Daniel Fischer wrote:
> Well, GHC takes only the class head into account for instance selection,
> and
>
> u -> (v -> w)
>
> matches both,
>
> a -> b   --  (a == u, b == v -> w)
>
> and
>
> m (c -> d)-- (m == ((->) u), c == v, d == w),
>
> ...  ...
>
> are indeed conflicting, so you can't even use OverlappingInstances etc. to
> make it work.

Thanks very much for the explanation.  I had only read 7.7 (Type families) 
from the GHC manual.  After reading what you wrote and all of 7.6 (Class and 
instances declarations) a couple of times I think I've got it.

As it says in 7.7.2.2.2, "[t]he instance declarations of a type family used in 
a single program may only overlap if the right-hand sides of the overlapping 
instances coincide for the overlapping types."
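That exception can be seen in a small standalone example (essentially the manual's own illustration): the two instances below overlap at F (Int, Int), but are accepted because both right-hand sides reduce to [Int] there:

```haskell
{-# LANGUAGE TypeFamilies #-}

type family F a
type instance F (a, Int) = [a]    -- overlaps the next instance...
type instance F (Int, b) = [Int]  -- ...but both give [Int] at (Int, Int)

xs :: F (Int, Int)  -- either instance applies; both reduce to [Int]
xs = [1, 2, 3]

main :: IO ()
main = print xs  -- [1,2,3]
```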

Cheers!  -Tyson




Re: Issue with type families

2010-03-03 Thread Tyson Whitehead
On March 3, 2010 18:35:26 Daniel Fischer wrote:
> Because:
>
> instance Applicative ((->) a) -- Defined in Control.Applicative
>
> so, from the instance Z (a -> b), with b == c -> d, we have an
>
> instance Z (a -> (b -> c))
>
> and from instance Z (m (u -> v)), we have, with m == ((->) x), an
>
> instance Z (x -> (u -> v))

Thanks Daniel,

That makes sense.  Strangely enough though, I had actually originally tried it 
with my own Applicative class just in case I was being tripped up by something 
like the (->) instance you pointed out, and it still didn't work.  That is

  {-# LANGUAGE FlexibleInstances, TypeFamilies #-}

  newtype I a = I a

  class A t where
  ap :: t (a -> b) -> t a -> t b

  class Z t where
  type W t
  z :: t -> W t

  instance A I where
  ap (I f) (I x) = I $ f x 

  instance Z (a -> b) where
  type W (a -> b) = a -> b
  z = id

  instance A t => Z (t (a -> b)) where
  type W (t (a -> b)) = t a -> t b
  z = ap

also gives me

  Temp.hs:17:9:
  Conflicting family instance declarations:
type instance W (a -> b) -- Defined at Temp.hs:17:9
type instance W (t (a -> b)) -- Defined at Temp.hs:21:9
  Failed, modules loaded: none.

Is the compiler somehow anticipating that I could add an instance for (->) to 
A and thus be back to the Applicative class situation?

Thanks!  -Tyson

PS:  I asked this here because type classes are a GHC issue; would the 
haskell-cafe list have been a more appropriate place?




Issue with type families

2010-03-03 Thread Tyson Whitehead
The following code

  {-# LANGUAGE FlexibleInstances, TypeFamilies #-}

  import Control.Applicative

  class Z t where
  type W t
  z :: t -> W t

  instance Z (a -> b) where
  type W (a -> b) = a -> b
  z = id

  instance Z (IO (a -> b)) where
  type W (IO (a -> b)) = IO a -> IO b
  z = (<*>)

works fine, but if I try and generalize to from IO to the Applicative classes

  instance (Applicative m) => Z (m (a -> b)) where
  type W (m (a -> b)) = m a -> m b
  z = (<*>)

I get the following error

  Temp.hs:10:9:
  Conflicting family instance declarations:
type instance W (a -> b) -- Defined at Temp.hs:10:9
type instance W (m (a -> b)) -- Defined at Temp.hs:14:9
  Failed, modules loaded: none.

unless I remove one of the instances, and then it is happy.

Is this correct?  I don't claim to really understand the rules regarding type 
classes, but I can't see why these are overlapping.

Thanks!  -Tyson


Re: C that setjmps -> haskell -> C that might longjmp

2010-03-02 Thread Tyson Whitehead
On March 2, 2010 06:17:46 Simon Marlow wrote:
> For now I suggest you use setjmp.  If you want to suggest an API to tell
> the RTS about a longjmp, then perhaps we could implement something, but
> I'm not sure what the API would look like, because you don't have a
> handle to the in-progress calls.  In RTS-speak you need to tell the RTS
> about which Tasks have been terminated.  I've actually been tinkering
> with this code a bit recently so there is now a structure called InCall
> which replaces some of what Task was for, but the idea is similar.

Thanks Simon,

I haven't quite managed to wrap my head around how the os threads, haskell 
threads, tasks, and capabilities all fits together in the rts, so I don't have 
any ideas to suggest for a relevant API.

I'll setjmp wrap them.  I was guessing that was the most likely option.  : )

Cheers!  -Tyson




C that setjmps -> haskell -> C that might longjmp

2010-03-01 Thread Tyson Whitehead
If I have the following call sequence

C code -> Haskell code -> various C code bits

where the various C code bits on the right might do a longjmp (their version 
of an exception), jumping back to the C code on the left.

Is it possible to have the C code on the left then somehow tell GHC to clean up 
the aborted Haskell code/resume executing it with an exception, or is the only 
option to setjmp wrap all the various C code bits on the right?

Thanks!  -Tyson




Re: Some great results on fused code with the LLVM backend

2010-02-25 Thread Tyson Whitehead
On February 21, 2010 20:57:25 Don Stewart wrote:
> I tried out some of the vector and uvector fusion benchmarks with the
> new LLVM backend
>
> http://donsbot.wordpress.com/2010/02/21/smoking-fast-haskell-code-using-ghc
>s-new-llvm-codegen/
>
> and got some great results for the tight loops generated through fusion.
> Up to 2x faster than gcc -O3 in some cases.

I had a quick scan through David's thesis the other day and noted that he 
attributes at least some of the tight-loop performance advantage to the STG 
registers not being pinned except at function entry and exit.

http://www.cse.unsw.edu.au/~pls/thesis/davidt-thesis.pdf

According to what I understand from the bottom of page 42 and top of page 43, 
this was done through a custom calling convention whereby the first N arguments 
get passed in the N registers assigned to the STG virtual registers, and every 
function is extended to take the STG registers as their first N parameters.

The net result is that, on entry to any function (there are only entries to 
worry about as everything is a tail call), the STG virtual registers are in 
the correct hardware registers, so the RTS is happy.

What is interesting, though, is that LLVM is free to spill them between function 
calls.  This can free up more registers for tight loops, and from my 
understanding of the bottom of page 53 and top of page 54, this was likely 
crucial to getting the great tight-loop performance in some cases.

I don't know if this even makes sense to ask, but could the same thing be done 
for the native code generator (i.e., implement global RTS registers as a 
calling convention instead of what I presume is a don't-touch approach)?

Cheers!  -Tyson

PS:  If you happen to read this list, that was a nice body of work David.




Re: Shared GHC libraries and the runtime system

2010-02-22 Thread Tyson Whitehead
On February 22, 2010 17:00:25 Max Bolingbroke wrote:
> Hi Tyson,
>
> This blog post
> (/) might help explain the  motivation (actually there are a few relevant
> posts on the well-typed site).
>
> Essentially, I believe that this is done so that you can vary the RTS
> by changing LD_LIBRARY_PATH. I've never used this facility so I'm
> unable to say how useful this actually is (or if it actually works at
> the moment).

Thanks Max.  Those were good write ups.  I'm pleased to report that I've got 
GHC hooked into Perl as a shared library (i.e., can call my GHC code from 
Perl).  I'm  working on trying to get a reasonable build system solution now.

So far I've been trying to build a single shared library, but I'm thinking the 
easiest/intended way might be to just get GHC to build its own library 
and then specify it as a required shared library to the Perl stub library.

This way I wouldn't have to mix flags from the build systems in the final link.

(although a shared library of stubs to a shared library seems a bit strange)

Cheers!  -Tyson

PS:  Thanks to everyone responsible for getting shared libraries working.




Shared GHC libraries and the runtime system

2010-02-22 Thread Tyson Whitehead
I was working on a shared library that loads up the GHC runtime (via hs_init) 
and have been running into a bunch of undefined stg symbols.

A bit of digging and it seems that GHC doesn't embed

 - the dependency libHSrts-ghc6.12.1.so, and
 - the associated rpath /usr/lib/ghc-6.12.1

into shared libraries that it builds.  Thus, if your main executable isn't GHC 
generated (and hence doesn't already pull these in), you run into unresolved 
symbols.

I can work around this by manually adding them myself.  Is there any reason 
GHC can't put this information by default into shared libraries though?

Thanks!  -Tyson

PS:  Further digging into the various shared libraries packaged with GHC 
(Debian GHC package 6.12.1-2) reveal that they are actually missing

 - the dependency libHSrts-ghc6.12.1.so, and
 - all rpaths (i.e., there are absolutely no rpaths in any of them)

$ objdump -p /usr/lib/ghc-6.12.1/base-4.2.0.0/libHSbase-4.2.0.0-ghc6.12.1.so
...
Dynamic Section:
  NEEDED  libHSinteger-gmp-0.2.0.0-ghc6.12.1.so
  NEEDED  libgmp.so.3
  NEEDED  libHSghc-prim-0.2.0.0-ghc6.12.1.so
  NEEDED  libc.so.6
  SONAME  libHSbase-4.2.0.0-ghc6.12.1.so
...





GHC FFI stub files

2010-02-21 Thread Tyson Whitehead
I'm writing a Perl module that interfaces some Haskell code, and it's a bit of a 
pain because both Perl and GHC have their own custom build system.

Is there a way to get ghc to give you the gcc/as/ld options (library paths, 
include paths, flags, etc) required for the FFI stub files it generates?

The best I've come up with so far is to extract them from GHC by telling it to 
use a "capture the arguments" command as the compiler and linker.

Thanks!  -Tyson

(I could also tell the Perl build system that ghc is the C compiler, assembler, 
and linker, but then I need to do a bunch of wrapping with -optc and friends)




Re: Quasi quoting

2010-02-03 Thread Tyson Whitehead
On February 2, 2010 19:48:27 Max Bolingbroke wrote:
> It's a shame that TH is too widely used to be amenable to refactoring
> into this sort of form.

I thought the standard thing to do in this case was to add either a pragma 
that gives you the new behaviour or trigger it with an alternative syntax 
(such as the aforementioned  (|...|)).

Throw in a warning whenever the old is used, and then, after a couple of 
years/releases, you can deprecate support for it guilt free.  : )

Cheers!  -Tyson




Re: Question regarding combinator call convention

2010-02-01 Thread Tyson Whitehead
On February 1, 2010 06:43:40 Simon Marlow wrote:
> That's a very old document and is inaccurate in various ways.  The
> Commentary is more up to date:
>
> http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts

That's good to know.  Thanks for the link.  : )

Cheers!  -Tyson




Question regarding combinator call convention

2010-01-29 Thread Tyson Whitehead
I was looking through the code for 6.12.1 and am a bit confused about 11.1.3 
in the runtime system documentation docs/rts/rts.tex.

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.79.641&rep=rep1&type=pdf

It says the argument-passing convention for a known combinator differs from 
that of the others in that it starts the arguments with R1 instead of R2 (R1 
points to the associated memory data structure for non-combinators).

The thing I'm confused by is that I would expect this to require treating 
combinators specially, and yet I don't see any combinator-specific handling (in 
either ClosureTypes.h or the generated AutoApply.cmm) to treat them specially.

I've also looked at functions like plusInt and a few of my own, but, apart 
from internal functions like stg_app_*, I'm not finding any code using the 
combinator passing convention.  Am I missing something?

Thanks!  -Tyson




Re: GHC core plugins

2010-01-27 Thread Tyson Whitehead
On January 27, 2010 05:44:00 José Pedro Magalhães wrote:
> Alright, ticket created: http://hackage.haskell.org/trac/ghc/ticket/3843
>
> In any case, for now I am willing to hard-code a new core-to-core pass on
> the compiler. Any pointers for where I have to look at?

Just wondering a couple of things about core -> core plugins.

Is the second use case mentioned by pumpkingod there (tracking which rewrite 
rules are firing and when) actually possible using a core -> core plugin?

(it would be great to have this information fed into some sort of emacs plugin 
where one could drill down on specific functions to see what is happening)

Also, is Hoopl expected to be used on core -> core plugins, and, if so, is the 
infrastructure in place for this?

Thanks!  -Tyson




Re: Question regarding the GHC users manual

2010-01-25 Thread Tyson Whitehead
Hi Max and Niklas,

Thank you both for your answers.  I get it now.

I didn't read carefully enough to note that the explicit kind on F a b was the 
kind of F a b rather than the kind of F (although, in retrospect, this last 
interpretation wouldn't have worked, as we would have needed at least * -> * -> *).

Thanks again.

Cheers!  -Tyson




Question regarding the GHC users manual

2010-01-25 Thread Tyson Whitehead
Hi all,

Just wondering if there could be a typo in section 7.7.2.1 of the GHC manual 
(type family declarations)

http://www.haskell.org/ghc/docs/latest/html/users_guide/type-families.html

Specifically, it says

type family F a b :: * -> *   -- F's arity is 2, 
  -- although its overall kind is * -> * -> * -> *

along with

F Char [Int]   -- OK!  Kind: * -> *
F Char [Int] Bool  -- OK!  Kind: *
F IO Bool  -- WRONG: kind mismatch in the first argument
F Bool -- WRONG: unsaturated application

and I'm wondering about the overall kind.  Shouldn't that be * -> * -> *, or 
am I not understanding something?
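For what it's worth, a small test suggests the manual's figure is right: since the declared result kind of F a b is * -> *, the overall kind picks up a third arrow, i.e. * -> * -> (* -> *).  A sketch (the instance for Maybe is my own stand-in, not from the manual):

```haskell
{-# LANGUAGE TypeFamilies #-}

-- F takes two type arguments and returns something of kind * -> *,
-- so overall: * -> * -> (* -> *), which is * -> * -> * -> *.
type family F a b :: * -> *
type instance F Char [Int] = Maybe

ok :: F Char [Int] Bool  -- the family application reduces to Maybe Bool
ok = Just True

main :: IO ()
main = print ok  -- Just True
```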

Thanks!  -Tyson


Re: Porting to DragonFly BSD

2009-11-16 Thread Tyson Whitehead
On November 15, 2009 07:42:46 Goetz Isenmann wrote:
> 001e5565 <__hscore_get_errno>:
>   1e5565:   55  push   %ebp
>   1e5566:   89 e5   mov%esp,%ebp
>   1e5568:   65 a1 00 00 00 00   mov%gs:0x0,%eax
>   1e556e:   8b 15 00 00 00 00   mov0x0,%edx
>   1e5574:   8b 04 10mov(%eax,%edx,1),%eax
>   1e5577:   c9  leave
>   1e5578:   c3  ret
>
> The corresponding relocation info is
>
> 001e5570 R_386_TLS_IE  errno
>
> > After some googling, and learning very little, I speclated that simply
> > adding a third line for R_386_TLS_IE with the same action as the first
> > line - just storing value, might work:
> >
> > case R_386_TLS_IE: *pP = value; break;
> >
> > This seems to work up to a point. That is I get no crashes in ghci.

According to "ELF Handling For Thread-Local Storage", R_386_TLS_IE "resolves 
to the absolute address of the GOT [global offset table] slot" that is assigned 
to the TLS (thread local storage) symbol.

http://people.redhat.com/drepper/tls.pdf   (middle of page 37)

The GOT slot is assigned a R_386_TLS_TPOFF relocation for the symbol.

So, what you are suppose to get is:

1- While linking, the linker resolves the R_386_TLS_IE relocation by updating 
the instruction with the absolute address of the corresponding GOT slot.

2- While loading, the dynamic linker resolves the R_386_TLS_TPOFF relocation 
by updating the GOT slot to contain the offset of the symbol relative to the 
TLS pointer (which is stored at %gs:0x0).

3- The program then computes the address of the TLS variable by loading the 
TLS pointer (the "mov %gs:0x0,%eax" instruction) and adding to it the symbol's 
TLS offset as stored in the GOT table (the updated "mov 0x0,%edx" instruction).

Cheers!  -Tyson




Re: Porting to DragonFly BSD

2009-11-16 Thread Tyson Whitehead
On November 15, 2009 07:42:46 Goetz Isenmann wrote:
> I am still not sure, but the generated code might be ok
>
> 001e5565 <__hscore_get_errno>:
>   1e5565:   55  push   %ebp
>   1e5566:   89 e5   mov%esp,%ebp
>   1e5568:   65 a1 00 00 00 00   mov%gs:0x0,%eax
>   1e556e:   8b 15 00 00 00 00   mov0x0,%edx
>   1e5574:   8b 04 10mov(%eax,%edx,1),%eax
>   1e5577:   c9  leave
>   1e5578:   c3  ret
>
> The corresponding relocation info is
>
> 001e5570 R_386_TLS_IE  errno

Not sure about DragonFly, but under Linux that would be the TLS initial-executable 
model.  It assumes that the TLS memory has already been allocated and is at the 
location pointed to by the word at %gs:0x0.

The default when working with position-independent code is the dynamic model.  
It has to go through ___tls_get_addr in order to give the dynamic loader a 
chance to allocate the thread-local storage memory if this is the first access.

The GNU variant of the code for this model is

0x00 leal x@tlsgd(,%ebx,1),%eax
0x07 call ___tls_get_addr@plt

It returns the address in eax.  The two relocations are R_386_TLS_GD, to make 
eax point to the GOT entry for the desired symbol, and R_386_PLT32, for the 
call to ___tls_get_addr.

For more details on the four different modes (each one is a successive 
optimization of the general dynamic model for cases where the symbol is in the 
same shared object and/or the TLS memory is known to be allocated)

http://people.redhat.com/drepper/tls.pdf

(pages 19, 27, 36, and 44 give the code for the fully generic and each 
combination of the optimizations respectively).

Under gcc you can specify the model via the "-ftls-model=" option or on 
a per-identifier basis with __attribute__((tls_model(""))).  Besides 
changing what code is emitted, this sets the STATIC_TLS flag in the dynamic 
section to force allocation at thread creation if required.

Cheers!  -Tyson




Re: Porting to DragonFly BSD

2009-11-09 Thread Tyson Whitehead
On November 9, 2009 12:39:25 Goetz Isenmann wrote:
> On Mon, Nov 09, 2009 at 10:44:47AM +, Simon Marlow wrote:
> > We should get these patches into GHC.  Most of them look
> > straightforward, but I'm slightly suspicious of this:
> >
> > -STANDARD_OPTS += -DCOMPILING_RTS
> > +STANDARD_OPTS += -DCOMPILING_RTS -D_POSIX_VERSION=199309
> >
> > why was that necessary?
>
> Good that you ask. Can't remember why it was necessary back in april
> to build ghc-6.10.2 on dragonfly-2.2.1. Cannot reproduce the problem
> today with ghc-6.10.4 on dragonfly-2.4.1.
>
> I am glad, that we do not need this define any more. I think it was a
> ugly workaround at best.

I'm not sure about dragonfly, but there are a bunch of similar options for 
glibc under Linux.  For example, "man feature_test_macros" says you need to 
define _POSIX_C_SOURCE to 199309 or greater to use POSIX.1b functionality.

Cheers!  -Tyson




Lazy computation being repeated?

2009-05-27 Thread Tyson Whitehead
I have a program that seems to me to be exhibiting strange behaviour.  
Basically, it looks like something somewhere is not being updated, and this is 
resulting in (very expensive) computation being repeated.

The program in question goes like so.  The results from the expensive 
calculation are stored in this structure

 data Cache = Cache { cacheBits0 :: UArr Float,
  cacheBits1 :: UArr Float }

and computed by this function (which takes several gigs of RAM)

 cacheIndex :: ... -> Cache
 cacheIndex ... = ...

The main program is

  main = catch prog error
  where prog = do ...
  let !cache = cacheIndex ...
  ...
  run cache ...
...

The run function reads an input line from stdin and calls effectChange to 
performs the indicated calculation

  run cache ... = loop
  where loop = do ...
  case effectChange ... cache of
...
  ...
  loop

The effectChange function calculates the change in an autocorrelation.  It is 
made much cheaper (order too fast to notice) through the use of the one-time 
computations stored in cacheBits0 and cacheBits1 (order several seconds).

Whether cacheBits1 is used or not depends on the input, while cacheBits0 is 
always used because I used lengthU cacheBits0 [the fast one from UArr] to get 
the length of both of them.  The first line of input causes a several second 
delay and lots of memory to be allocated (the cacheBits are being computed).

Repeated lines of input respond very quickly and the memory stays mostly 
constant, unless cacheBits1 is used, in which case there is always a delay of 
several seconds and big jumps in memory allocation.

The only thing I can think of is that each time effectChange is run and it 
uses cacheBits1, it causes its value to be recomputed.  Indeed, the delay 
and big memory jumps go away if I change all references of cacheBits1 to 
cacheBits0 in effectChange or make the data fields strict like so

 data Cache = Cache { cacheBits0 :: !(UArr Float),
  cacheBits1 :: !(UArr Float) }

Does this make sense?  Am I missing some reason why this should be the 
expected behaviour for this program?

Thanks!  -Tyson


Re: Closure elimination transformation (continuation style passing code)

2009-05-20 Thread Tyson Whitehead
Thanks for all the feedback guys,

I already find it pretty amazing how well simple stuff expressed in a higher 
level manner compiles down to something decent, and it seems like the future 
is only going to get brighter.  I can hardly wait...  : )

In the meantime, I'll go back to trying to find the right balance between 
expressiveness and something GHC can turn into lean mean code.

Thanks again!  -Tyson

PS:  Sounds like Boquist's thesis would be an interesting read.  I've seen 
references to it come up a couple of times now.




Re: Closure elimination transformation (continuation style passingcode)

2009-05-20 Thread Tyson Whitehead
On May 20, 2009 05:50:54 Claus Reinke wrote:
> I'm not at all sure what you're aiming for (the above doesn't compile),

Yeah.  The newtype Step a b was required to break the recursive types, and I 
dropped it when performing the various transformations, so they don't type 
check.  Here it is again with the newtypes so each bit can be compiled.


0- initial code

  newtype Step a b = Step { unStep :: (a -> Step a b -> b) -> b -> b }

  iter :: [a] -> (a -> Step a b -> b) -> b -> b
  iter [] next done = done
  iter (x:xs) next done = next x (Step $ iter xs)

  count :: Int -> Char -> Step Char Int -> Int
  count i x step = unStep step (count $ i+1) (i+1)

  test :: String -> Int
  test xs = iter xs (count 0) 0
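As a quick sanity check, the step-0 code does run and counts the characters of its input; repeated here in one self-contained piece with a main added:

```haskell
-- The CPS-style iterator from step 0: the newtype breaks the
-- recursive type between the iterator and its consumer.
newtype Step a b = Step { unStep :: (a -> Step a b -> b) -> b -> b }

iter :: [a] -> (a -> Step a b -> b) -> b -> b
iter []     next done = done
iter (x:xs) next done = next x (Step $ iter xs)

count :: Int -> Char -> Step Char Int -> Int
count i x step = unStep step (count $ i + 1) (i + 1)

test :: String -> Int
test xs = iter xs (count 0) 0

main :: IO ()
main = print (test "hello")  -- 5
```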


1a- avoid forming the (count _) closures by passing the function and the 
argument instead of the function bound to the argument

  newtype Step a c b = Step { unStep :: (c -> a -> Step a c b -> b) -> c -> b 
-> b }

  iter :: [a] -> (c -> a -> Step a c b -> b) -> c -> b -> b
  iter [] next i done = done
  iter (x:xs) next i done = next i x (Step $ iter xs)

  count :: Int -> Char -> Step Char Int Int -> Int
  count i x step = unStep step count (i+1) (i+1)

  test :: String -> Int
  test xs = iter xs count 0 0


1b- avoid forming the (iter _) closure by passing the function and the 
argument instead of the function bound to the argument

  newtype Step a c b = Step { unStep :: [a] -> (c -> a -> Step a c b -> [a] -> 
b) -> c -> b -> b }

  iter :: [a] -> (c -> a -> Step a c b -> [a] -> b) -> c -> b -> b
  iter [] next i done = done
  iter (x:xs) next i done = next i x (Step iter) xs

  count :: Int -> Char -> Step Char Int Int -> String -> Int
  count i x step xs = unStep step xs count (i+1) (i+1)

  test :: String -> Int
  test xs = iter xs count 0 0


2- specialize iter for next = count

  newtype Step a c b = Step { unStep :: [a] -> (c -> a -> Step a c b -> [a] -> 
b) -> c -> b -> b }

  iter  :: [a]-> (c -> a -> Step a c b -> [a] -> b) -> c   -> b   -> b
  iter' :: String   -> Int -> Int -> Int
  iter  [] next i done = done
  iter  (x:xs) next i done = next  i x (Step iter) xs
  iter' []  i done = done
  iter' (x:xs)  i done = count i x (Step iter) xs

  count :: Int -> Char -> Step Char Int Int -> String -> Int
  count i x step xs = unStep step xs count (i+1) (i+1)

  test :: String -> Int
  test xs = iter' xs 0 0


3- specialize count for step = iter (and use the specialized iter')

  newtype Step a c b = Step { unStep :: [a] -> (c -> a -> Step a c b -> [a] -> 
b) -> c -> b -> b }

  iter  :: [a]-> (c -> a -> Step a c b -> [a] -> b) -> c   -> b   -> b
  iter' :: String   -> Int -> Int -> Int
  iter  [] next i done = done
  iter  (x:xs) next i done = next   i x (Step iter) xs
  iter' []  i done = done
  iter' (x:xs)  i done = count' i x xs

  count  :: Int -> Char -> Step Char Int Int -> String -> Int
  count' :: Int -> Char  -> String -> Int
  count  i x step xs = unStep step xs count (i+1) (i+1)
  count' i x  xs = iter'   xs   (i+1) (i+1)

  test :: String -> Int
  test xs = iter' xs 0 0

(iter and count are no longer used and can be dropped at this point)


4- inline count'

  iter' :: String -> Int -> Int -> Int
  iter' [] i done = done
  iter' (x:xs) i done = iter' xs (i+1) (i+1)

  count' :: Int -> Char -> String -> Int
  count' i x xs = iter' xs (i+1) (i+1)

  test :: String -> Int
  test xs = iter' xs 0 0

(count' is no longer used and can be dropped at this point)


5- eliminate the done argument as it is always the same as the i argument

  iter'' :: String -> Int -> Int
  iter'' [] i = i
  iter'' (x:xs) i = iter'' xs (i+1)

  test :: String -> Int
  test xs = iter'' xs 0
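As a side note, the iter'' arrived at in step 5 is exactly the shape of a strict left fold, so the same tight accumulator loop can be written directly (testFold here is an illustrative name, not from the thread):

```haskell
import Data.List (foldl')

-- iter'' from step 5 is a strict left fold with a unit step; writing
-- the length this way hands GHC the same accumulator loop without the
-- manual closure-elimination steps.
testFold :: String -> Int
testFold = foldl' (\i _ -> i + 1) 0
-- testFold "hello" == 5
```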


Cheers!  -Tyson




Re: Closure elimination transformation (continuation style passing code)

2009-05-19 Thread Tyson Whitehead
On May 19, 2009 22:17:39 Tyson Whitehead wrote:
> 2- specialize count for step = iter
>
>  
>
> 3- specializing iter for next = count
>
>  
>

I changed this just before sending it and managed to goof steps two and three 
(the specializations).  The whole thing, with the correct steps two and three, 
should have been the following

1- avoid forming the (iter xs) and (count i+1) closures by passing the 
function and the arguments instead of the function bound to the argument

  iter [] next i done = done
  iter (x:xs) next i done = next i x iter xs

  count i x step xs = step xs count (i+1) (i+1)

  test xs = iter xs count 0 0

2- specialize iter for next = count

  iter  [] next i done = done
  iter  (x:xs) next i done = next  i x iter xs
  iter' []  i done = done
  iter' (x:xs)  i done = count i x iter xs

  count i x step xs = step xs count (i+1) (i+1)

  test xs = iter' xs 0 0

3- specialize count for step = iter (and use the specialized iter')

  iter  [] next i done = done
  iter  (x:xs) next i done = next i x iter xs
  iter' []  i done = done
  iter' (x:xs)  i done = count' i x xs

  count  i x step xs = step  xs count (i+1) (i+1)
  count' i x  xs = iter' xs   (i+1) (i+1)

  test xs = iter' xs 0 0

(iter and count are no longer used and can be dropped at this point)

4- inline count'

  iter' [] i done = done
  iter' (x:xs) i done = iter' xs (i+1) (i+1)

  count' i x xs = iter' xs (i+1) (i+1)

  test xs = iter' xs 0 0

(count' is no longer used and can be dropped at this point)

5- eliminate the done argument as it is always the same as the i argument

  iter'' [] i = i
  iter'' (x:xs) i = iter'' xs (i+1)

  test xs = iter'' xs 0

As the first one does not seem to be performed, I did it manually to see 
if that would be enough for GHC to pick up on the rest.  It seems that this 
isn't the case: the second two are not being done either.

Cheers!  -Tyson




Closure elimination transformation (continuation style passing code)

2009-05-19 Thread Tyson Whitehead
I was wondering about GHC's ability to optimize the following code

  module Test (test) where

  newtype Step a b = Step { unStep :: (a -> Step a b -> b) -> b -> b }

  iter :: [a] -> (a -> Step a b -> b) -> b -> b
  iter [] next done = done
  iter (x:xs) next done = next x (Step $ iter xs)

  count :: Int -> Char -> Step Char Int -> Int
  count i x step = unStep step (count $ i+1) (i+1)

  test :: String -> Int
  test xs = iter xs (count 0) 0

(test implements the string length function).

The transformations steps that reduce this to the optimized version are

1- avoid forming the (iter xs) and (count i+1) closures by passing the 
function and the arguments instead of the function bound to the argument

  iter [] next i done = done
  iter (x:xs) next i done = next i x iter xs

  count i x step xs = step xs count (i+1) (i+1)

  test xs = iter xs count 0 0

2- specialize count for step = iter

  iter [] next i done = done
  iter (x:xs) next i done = next i x iter xs

  count i x xs = iter xs count (i+1) (i+1)

  test xs = iter xs count 0 0

3- specializing iter for next = count

  iter [] i done = done
  iter (x:xs) i done = count i x iter xs

  count i x xs = iter xs (i+1) (i+1)

  test xs = iter xs 0 0

4- inline count

  iter [] i done = done
  iter (x:xs) i done = iter xs (i+1) (i+1)

  test xs = iter xs 0 0

5- eliminate the done argument as it is always the same as the i argument

  iter [] i = i
  iter (x:xs) i = iter xs (i+1)

  test xs = iter xs 0

Currently 6.10.1 with -O2 seems stuck with regard to step 1 (eliminating the 
closures).  Is there any hope of getting it to perform these transformations?

Thanks!  -Tyson


Re: No "last core parallel slowdown" on OS X

2009-04-21 Thread Tyson Whitehead
On April 21, 2009 04:39:40 Simon Marlow wrote:
> > These ratios match up like physical constants, or at least invariants of
> > my Haskell implementation. However, the user time is constant on OS X, so
> > these ratios reflect the actual parallel speedup on OS X. The user time
> > climbs steadily on Linux, significantly diluting the parallel speedup on
> > Linux. Somehow, whatever is going wrong in the interaction between
> > Haskell and Linux is being captured in this increase in user time.
>
> We can't necessarily blame this on Linux: the two machines have
> different hardware.  There could be cache-effects at play, for
> example.

Why not try booting a CD or thumb-drive linux distro (e.g., ubuntu live) on 
your 2.4 GHz Q6600 OS X box and see how things stack up.  It would certainly 
eliminate any questions of hardware differences.

Cheers!  -Tyson




Re: No "last core parallel slowdown" on OS X

2009-04-19 Thread Tyson Whitehead
On April 18, 2009 16:46:44 Daniel Peebles wrote:
> That looks great! I wonder what about Mac OS leads to such good
> performance...
>
> Now if only we could get a nice x86_64-producing GHC for Mac OS too, I
> could use all my RAM and the extra registers my Mac Pro gives me :)

I was a bit surprised when I read the initial report because

1-  I thought GHC had a hard time with 32bit x86 code due to the integer 
register pressure and hacking around the stack based FPU, and

2-  I thought OS X had multithreading performance issues (or at least that is 
what I had read in various reports regarding using it as a server).

This leaves me wondering: how do the absolute numbers compare?  Could the extra 
overhead due to the various 32bit issues be giving more room for better 
threading performance?  What do you get if you use 32bit GHC with Linux?

Cheers!  -Tyson





Re: compilation of pattern-matching?

2009-03-25 Thread Tyson Whitehead
On March 25, 2009 21:38:55 Claus Reinke wrote:
> If you don't have that kind of control, and generate code as if
> there were no hardware-level optimizations, the resulting
> mismatch will manifest in hard-to-predict variations in
> performance, making it difficult to see how much or why
> speed is lost. No fun.

I would think the ultimate for this would be to be able to feed profiling 
information into the compiler.  No messy pragmas all over the code and I would 
guess there would be other benefits as well (e.g., inlining decisions).

Of course it's easy to sit back and suggest things like this -- but I would 
imagine it's quite a whole other kettle of fish to build them.  : )

Cheers!  -Tyson




Re: compilation of pattern-matching?

2009-03-23 Thread Tyson Whitehead
On March 23, 2009 19:46:27 Claus Reinke wrote:
> My idea was that case branches correspond to conditional jumps
> (though the exact correspondence and optimization has been the
> subject of countless papers). If I loop through a very long list,
> most of the time the test for (:) will succeed, requiring no jump,
> while the test for [] will fail, requiring a jump to the alternative
> branch. So, if GHC's pattern-match compilation is naive, the
> reordering will introduce 2 jumps into the common case of the
> loop where none would be needed, right?

module Test (test) where

test :: [a] -> Int
test (a:b:c) = 2
test (a:[])  = 1
test []  = 0

gives the following cmm (with GHC 6.10.1 and -O2)

Test_test_entry() {
...
chn:
if (Sp - 8 < SpLim) goto chp; // RTS stack check for space
R1 = R2;
I64[Sp - 8] = sgO_info;   // Argument evaluation return address
Sp = Sp - 8;
if (R1 & 7 != 0) goto chs;// Is argument already evaluated?
jump I64[R1] ();  // No, evaluate it
chp:
R1 = Test_test_closure;   // RTS stack expansion (GC?)
jump stg_gc_fun ();
chs: jump sgO_info ();// Yes, go directly to return address
}

sgO_ret() {
...
chg:
_chh = R1 & 7;// Constructor tag is in lower ptr bits
if (_chh >= 2) goto chi;  // Does the tag indicate (:)?
R1 = Test_lvl2_closure+1; // No, load up closure for 0 and return
Sp = Sp + 8;
jump (I64[Sp + 0]) ();
chi:
R1 = I64[R1 + 14];// Yes, get the tail of (:)
I64[Sp + 0] = sgQ_info;   // Tail evaluation return address
if (R1 & 7 != 0) goto chl;// Is tail already evaluated?
jump I64[R1] ();  // No, evaluate it
chl: jump sgQ_info ();// Yes, go directly to return address
}

sgQ_ret() {
...
cha:
_chb = R1 & 7;// Constructor tag is in lower ptr bits
if (_chb >= 2) goto chc;  // Does the tag indicate (:)?
R1 = Test_lvl1_closure+1; // No, load up closure for 1 and return
Sp = Sp + 8;
jump (I64[Sp + 0]) ();
chc:
R1 = Test_lvl_closure+1;  // Yes, load up closure for 2 and return
Sp = Sp + 8;
jump (I64[Sp + 0]) ();
}

Thus the trip is more like (assuming the first two (:) are already evaluated)

test -> chs (WHNF check -- i.e., first (:) is already evaluated) 
chs  -> sgO
sg0  -> chi (constructor check -- i.e., not [])
chi  -> chl (WHNF check -- i.e., second (:) is already evaluated)
chl  -> sgQ
sgQ  -> chc (constructor check -- i.e., not (a:[]))
chc  -> return

Looking at the assembler, things are a bit better in that the gotos that 
immediately execute a jump are just replaced with a jump.  For example, the 
assembler for test gives (test -> chs -> sg0 is replaced with test -> sg0)

...
Test_test_info:
.Lchn:
leaq -8(%rbp),%rax// RTS stack check for return address
cmpq %r14,%rax
jb .Lchp
movq %rsi,%rbx
movq $sgO_info,-8(%rbp)   // Argument evaluation return address
addq $-8,%rbp
testq $7,%rbx // Is argument already evaluated?
jne sgO_info  // Yes, go directly to return address
jmp *(%rbx)   // No, evaluate it
.Lchp:
movl $Test_test_closure,%ebx  // RTS stack expansion (GC?)
jmp *-8(%r13)

...
sgO_info:
.Lchg:
movq %rbx,%rax// Constructor tag is in lower ptr bits
andq $7,%rax
cmpq $2,%rax  // Does the tag indicate (:)?
jae .Lchi 
movl $Test_lvl2_closure+1,%ebx// No, load up closure for 0 and return
addq $8,%rbp
jmp *(%rbp)
.Lchi:
movq 14(%rbx),%rbx// Yes, get the tail of (:)
movq $sgQ_info,(%rbp) // Tail evaluation return address
testq $7,%rbx // Is tail already evaluated?
jne sgQ_info  // Yes, go directly to return address
jmp *(%rbx)   // No, evaluate it

...


Thus you actually get

test -> sg0 (WHNF check -- i.e., first (:) is already evaluated) 
sg0  -> chi (constructor check -- i.e., not [])
chi  -> sgQ (WHNF check -- i.e., second (:) is already evaluated)
sgQ  -> chc (constructor check -- i.e., not (a:[]))
chc  -> return

I guess this is a long winded way of saying that the branches are being 
ordered such that the fall though case is not the one that you put first, 
which, if I recall correctly, is somewhat bad as the x86 branch predictor 
guesses a forward branch that hasn't been seen before will fall through.

Perhaps they are being ordered by the constructor tag?

Cheers!  -Tyson

PS:  I reversed GHC's ordering of test, sgO, and sgQ for readability above.  
The test -> sg0 and chi -> sgQ jumps actually go backwards, which is actually 
what you want because, if I re

GHC and Haskell Standard wrt floatRange...

2009-03-09 Thread Tyson Whitehead
I think there might be a difference between C/GHC and the Haskell Standard's 
idea of the range of the exponential.  Specifically, the comment in gcc's 
float.h (i.e., where GHC appears to gets its definition from) says

/* Minimum int x such that FLT_RADIX**(x-1) is a normalized float, emin */

while the Haskell Standard says

"...the lowest and highest values the exponent may assume respectively..."

This results in (GHC 6.10.1 on Debian)

Prelude> (2**128::Float)
Infinity
Prelude> (2**127::Float)
1.7014119e38
Prelude> floatRange (0::Float)
(-125,128)

I can file a bug report/patch if nobody is seeing something I missed here...
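A quick standalone check of the mismatch (the expected outputs in the comments match the GHCi session above):

```haskell
-- floatRange reports an upper bound of 128, yet 2**128 overflows,
-- because the exponent of the largest finite Float is 127 in the
-- Haskell report's sense.
main :: IO ()
main = do
  print (floatRange (0 :: Float))         -- (-125,128)
  print (2 ** 127 :: Float)               -- 1.7014119e38 (finite)
  print (isInfinite (2 ** 128 :: Float))  -- True
```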

Cheers!  -Tyson




Re: Loop unrolling + fusion ?

2009-03-02 Thread Tyson Whitehead
On March 1, 2009 17:31:13 Max Bolingbroke wrote:
> I am no assembly guru and haven't seen that last form of leaq either,
> but I'm going to guess that:
>
>  leaq(%rsi,%rsi,4), %rax
>
> Says that rax is rsi * ((1 + 1) * 2 ^ 4) = rsi * 32
>
>  leaq0(,%rax,8), %rsi

If I recall correctly, leaq is load effective address (i.e., write the address 
into the destination register instead of the data at the address).

The address form is i(b,o,s) = i+b+o*s.  You have (%rsi,%rsi,4) = %rsi+%rsi*4 
into %rax followed by 0(,%rax,8) = rax*8 into %rsi, ultimately giving %rsi*40 
into %rsi (which is the multiplication you have in the ghc generated loop).

(the restriction on the address form is that s must be one of 1, 2, 4, or 8)
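Spelling the strength reduction out as arithmetic (times40 is an illustrative name, not from the thread):

```haskell
-- The two leaq instructions compute x*40 with no multiply:
--   leaq (%rsi,%rsi,4), %rax   is  x + x*4  = x*5
--   leaq 0(,%rax,8),    %rsi   is  (x*5)*8  = x*40
times40 :: Int -> Int
times40 x = let x5 = x + x * 4
            in  x5 * 8
```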

Interesting discussion by the way.  : )

Cheers!  -Tyson





Re: Compiler optimizations questions for ghc 6.10...

2009-02-19 Thread Tyson Whitehead
On February 19, 2009 18:20:33 Krasimir Angelov wrote:
> Oh. I looked at the primops.txt.pp for something suspicious but I
> didn't check the actual implementation of quot. I thought that quot
> calls quotInt# directly. When I use quotInt in the code I can get the
> real idiv assembly instruction. Still, the code generated by GHC is
> strange: it doesn't throw any exception actually. It just evaluates the
> same expression but with the constant maxBound.
>
> On Thu, Feb 19, 2009 at 11:19 PM, Max Bolingbroke
> >a `quot` b
> >
> > | b == 0 = divZeroError
> > | a == minBound && b == (-1) = overflowError
> > | otherwise  =  a `quotInt` b

I checked the quot one like you said.  Kind of strange all right.

It looks like it is performing the a == minBound check (where a = maxBound-x) 
despite the b == (-1) check (where b = 10) being optimized away.

Cheers!  -Tyson




Re: Compiler optimizations questions for ghc 6.10...

2009-02-18 Thread Tyson Whitehead
On February 18, 2009 12:42:02 Tyson Whitehead wrote:
> On February 18, 2009 04:29:42 Max Bolingbroke wrote:
> > Yes - GHC wants to share the work of (maxBound-x)`div`10 between
> > several partial applications of "digit". This is usually a good idea,
> > but in this case it sucks because it's resulted in a massively
> > increased arity. IMHO GHC should fix this by:
> > * Marking divInt# INLINE in the base library. This would result in
> > your code would just containing uses of quotInt#
> > * Making some operations cheap even if they may fail
> > (PrimOp.primpOpIsCheap should change). Though this might mean that we
> > turn non-terminating programs into terminating ones (such operations
> > get pushed inside lambdas) but this is consistent with our treatment
> > of lambdas generally.
>
> I am guessing this is because quot maps directly onto the x86 idiv instruction
> due to both of them truncating towards zero, while div, with its truncation
> towards negative infinity, does not.  Running with -ddump-asm seems to back
> this up as quotInt# compiles down to an idiv instruction and divInt# to a
> call through base_GHCziBase_divIntzh_info.  Unfortunately for me, I always
> seem to instinctively go with div and mod ahead of quot and rem.

I see what you mean about div as it is defined through divMod which is in turn 
defined through quotRem in Prelude.

n `div` d =  q  where (q,_) = divMod n d
divMod n d =  if signum r == negate (signum d) then (q-1, r+d) else qr
  where qr@(q,r) = quotRem n d
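Transcribed into a standalone sketch (myDivMod is an illustrative name), the quoted definitions behave like the built-in divMod:

```haskell
-- div adjusts the quotRem result whenever the remainder's sign
-- disagrees with the divisor's; that adjustment is the extra work on
-- top of a bare idiv-style division.
myDivMod :: Int -> Int -> (Int, Int)
myDivMod n d
  | signum r == negate (signum d) = (q - 1, r + d)
  | otherwise                     = qr
  where qr@(q, r) = quotRem n d
```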

However, GHC almost seems to be doing its own thing as, as I mentioned above, 
it turns it into a divInt# (which is not in GHC.Prim) in the tidy core, which 
then turns into a call through base_GHCziBase_divIntzh_info in the assembler.

(note that I'm just looking at the source from the haskell.org libraries link)

Cheers!  -Tyson




Re: Compiler optimizations questions for ghc 6.10...

2009-02-18 Thread Tyson Whitehead
On February 18, 2009 04:29:42 Max Bolingbroke wrote:
> > The part of the code under the first lambda in digit is as follows (I
> > didn't keep the original dump, so the uniques have changed here).  It's
> > the second part of the Int overflow bounds check (i.e., y <=
> > (maxBound-x)`div`10), and, indeed, something you don't want to compute
> > unless the easy check fails.
>
> Yes - GHC wants to share the work of (maxBound-x)`div`10 between
> several partial applications of "digit". This is usually a good idea,
> but in this case it sucks because it's resulted in a massively
> increased arity. IMHO GHC should fix this by:
> * Marking divInt# INLINE in the base library. This would result in
> your code would just containing uses of quotInt#
> * Making some operations cheap even if they may fail
> (PrimOp.primpOpIsCheap should change). Though this might mean that we
> turn non-terminating programs into terminating ones (such operations
> get pushed inside lambdas) but this is consistent with our treatment
> of lambdas generally.

Reading your response got me thinking, "quot and div, I thought I had changed 
everything to quot because earlier I found that GHC was leaving evaluation of 
constant expressions like "maxBound-9`div`10" until runtime if I used div."

Turns out I had missed that one in the second bounds check, and changing it 
from "y <= (maxBound-x)`div`10" to "y <= (maxBound-x)`quot`10" resulted in GHC 
"just doing the right thing".  I presume this is because quot must be either 
marked INLINE and/or primOpIsCheap like you said div should be.

I am guessing this is because quot maps directly onto the x86 idiv instruction 
due to both of them truncating towards zero, while div, with its truncation 
towards negative infinity, does not.  Running with -ddump-asm seems to back 
this up as quotInt# compiles down to an idiv instruction and divInt# to a call 
through base_GHCziBase_divIntzh_info.  Unfortunately for me, I always seem to 
instinctively go with div and mod ahead of quot and rem.
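The rounding difference is easy to see on negative operands (a small illustrative check):

```haskell
-- quot/rem truncate toward zero (matching x86 idiv); div/mod truncate
-- toward negative infinity, so the pairs disagree exactly when the
-- operands' signs differ.
main :: IO ()
main = do
  print ((-7) `quot` 2, (-7) `rem` 2)  -- (-3,-1)
  print ((-7) `div`  2, (-7) `mod` 2)  -- (-4,1)
```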

> Actually, your divInt# call wouldn't even usually be floated out to
> between two lambdas, but at the time FloatOut runs there is something
> in between the \x lambda and the lambdas from the state monad - the
> monadic bind operator! So FloatOut feels free to move the computation
> for "x" up even though that >>= will go away as soon as we run the
> simplifier. What a disaster!

I would like to try combining the GHC optimizer with a genetic algorithm so 
you could set it to pound away on your core loops for an hour or so to find 
the right sequence of ghc optimization steps to generate the tightest code.

Maybe it could then write out an optimization hint file that regular GHC could 
optionally take in for use alongside the standard rules of thumb to produce 
great code.  All I need is more time and more knowledge.  Maybe someday.  : )

> For me, making digit INLINE fixes all this, but that's probably
> because the context of my code is not the same as yours (I had to
> invent parts of a program to bind the bs, q and n variables).

Sorry about that.  I've put the entire routine at

http://www.sharcnet.ca/~tyson/haskell/Example1.hs

I should have done this in the first place as it allows me to provide the full 
code while also stripping it down in the emails to make them more readable.

> For your
> immediate problem, I would suggest this bit of GHC witchdoctory:
>
>   where digit :: Int -> ParseInt Int Int
>         digit !x = do !y <- lift get
>                       ( if y <= (maxBound-9)`quot`10 || y <= ghc_hacks
>                           then let !y' = y*10+x in (lift $ put y') >> s2
>                           else throwError "integer overflow" )
>           where {-# INLINE ghc_hacks #-}
>                 ghc_hacks = (maxBound-x)`div`10
>
> With luck, this should make the loop-invariant cheap in GHC's eyes,
> preventing it from trying to share it. It works for me - but like I
> say things may differ in your program.

I tried that as well, and am pleased to report that it works great.  I think I 
actually understand what is going on here and will hopefully be able to wield 
it in the future to my advantage.

Generally I've had very little luck with the INLINE hammer when the function 
is at all complex.  GHC always seems to find a way to thwart my pathetic 
attempts and punish me with even worse code (like then not inlining the monad 
bind operation in the code resulting from the inlining).  : ) 

> If you are interested in trying my script, it's available via Git at
> http://github.com/batterseapower/scripts/blob/da1f24ba16c27e3994aa66f9db352
>ec1102c39d2/ghc-dump-split and is called as "ghc-dump-split ghc-dump-file",
> where ghc-dump-file results from redirection of GHC stdout+stderr to a
> file.

I'll grab a copy of that.  : )

> Hope all that helps,

It sure did.  Thanks very much for all your time!

Re: Compiler optimizations questions for ghc 6.10...

2009-02-17 Thread Tyson Whitehead
On February 17, 2009 19:24:44 Max Bolingbroke wrote:
> 2009/2/17 Tyson Whitehead :
> > (compiled with ghc 6.10 with options -O2 -ddump-simpl)

That should have been -ddump-stranal instead of -ddump-simpl.

> > I was wondering why lvl_s1mF is not being inlined into a_s1Gv in the core
> > at the bottom of this email as that is the only place it is ever
> > referenced.
>
> The relevant GHC code is SimplUtils.preInlineUnconditionally. It looks
> like it dosen't get inlined for two reasons:
> 1) It's not a manifest lambda (it's an application) so inlining inside
> another lambda would change the number of times the FVs of lvl_s1mF
> might occur

I have to confess my ignorance here as my google fu failed and so I still 
don't know what a manifest lambda is (other than not an application).  : )

> 2) I'm not sure if the use-context is considered interesting by GHC
> because the application of the function might be hidden by the cast.
> Not sure about this one.

I was wondering about that, which is why I didn't remove all the cast noise.

> So it looks like the problem stems from digit_s1l3 having arity 1
> rather than arity 3. You could try and force it into a higher arity
> somehow, but I can't say exactly how you might do that without seeing
> the rest of the Core (and in particular the part under the first
> lambda in the definition of digit).

The thing is that the inner lambdas come from inlining the StateT monad 
transformers in a StateT q (StateT Int (ErrorT String Identity)) monad (i.e., 
the first one is the q state -- which works out to an Int -- and the second is 
the Int state).  I guess I could explicitly pass them around, but that would 
seem to defeat the purpose of having StateT.

The actual routines under this implement part of a FSM for (hopefully) 
efficiently extracting an Int from a ByteString (or a uvector UArr -- source 
of the Step data type).  The relevant part of the actual code, which is a bit 
hacked up with ! patterns from my attempts to get better code, is as follows.

type ParseInt q a = StateT q (StateT Int (ErrorT String Identity)) a

next :: q -> Step q Word8
next i | i==n      = Done
       | otherwise = Yield (bs `BS.unsafeIndex` i) (i+1)

wrap :: Monad m => (Word8 -> StateT q m a) -> StateT q m a -> StateT q m a
wrap yield (done::StateT q m a) = loop
where loop :: StateT q m a
  loop = do q <- get
            case next q of
              Yield x q' -> put q' >> yield x
              Skip    q' -> put q' >> loop
              Done       -> done
s2 :: ParseInt q Int
s2 = wrap yield done
where yield :: Word8 -> StateT q (StateT Int (ErrorT String Identity)) Int
  yield x | x==48 = digit 0
  | x==49 = digit 1
  | x==50 = digit 2
  | x==51 = digit 3
  | x==52 = digit 4
  | x==53 = digit 5
  | x==54 = digit 6
  | x==55 = digit 7
  | x==56 = digit 8
  | x==57 = digit 9
  | otherwise = do !y <- lift get
   return y
  where digit :: Int -> ParseInt q Int
            digit !x = do !y <- lift get
                          ( if y <= (maxBound-9)`quot`10 || y <= (maxBound-x)`div`10
                              then let !y' = y*10+x in (lift $ put y') >> s2
                              else throwError "integer overflow" )
          done :: ParseInt q Int
          done = do !y <- lift get
                    return y

I just finished adding the ParseInt q Int type to help with email line wrapping.  
As I alluded to in my original email, if I don't have the Int overflow check 
in digit, it is not chosen as the loop breaker, all the StateT stuff is 
compiled away, and you get a really nice efficient assembler loop (which is 
important because the final FSM has to actually chew through GBs of data).

The part of the code under the first lambda in digit is as follows (I didn't 
keep the original dump, so the uniques have changed here).  It's the second 
part of the Int overflow bounds check (i.e., y <= (maxBound-x)`div`10), and, 
indeed, something you don't want to compute unless the easy check fails.

digit_s1lk =
  \ (x_aqR [ALWAYS Just U(L)] :: GHC.Types.Int) ->
case x_aqR
of x_XsQ [ALWAYS Just A] { GHC.Types.I# ipv_s1bD [ALWAYS Just L] ->
let {
  lvl_s1my [ALWAYS Just D(T)] :: GHC.Types.Int
  [Str: DmdType]
  lvl_s1my =
case GHC.Prim.-# 9223372036854775807 ipv_s1bD
of wild2_a1xi [ALWAYS Just L] {
  __DEFAULT ->
case GHC.Base.divInt# wild

Re: Compiler optimizations questions for ghc 6.10...

2009-02-17 Thread Tyson Whitehead
On February 17, 2009 19:24:44 Max Bolingbroke wrote:
> 2009/2/17 Tyson Whitehead :
> > It also seems the extra levels of indirection are defeating the
> > strictness analyzer on eta_s1CN in a_s1Gv as all code branches either
> > directly force it or ultimately pass it to digit_s1l3 as in the included
> > branch.
> >
> > Also, why isn't digit_s1l3 optimized to take its first parameter unboxed?
> >  It is strict in its first argument, and grepping the core shows that it
> > is only ever used like in lvl_s1mF (i.e., passed things like lvl_s1mG).
>
> Yeah, that's weird. I don't know the answer to this. Have you actually
> got to the worker-wrapper stage at the point you copied this core?

Yes.  You are right.  Contrary to what the top of the email said, I created 
that output with -ddump-stranal, and -dshow-passes indicates that the worker-
wrapper stage comes next.  If I dump it (or just the final core with -ddump-
simpl), digit* is entirely replaced with a first-argument-unboxed $wdigit*.

The inner lambdas (i.e., the second and third arguments) remain boxed.

It seems I should have just used the -ddump-simpl output instead of the -
ddump-stranal output.  I had just gotten to thinking that the -ddump-simpl 
output did not include strictness analysis because I didn't see it on a bunch 
of the code (in retrospect, that was because that code was created by later 
stages).

Cheers!  -Tyson





Compiler optimizations questions for ghc 6.10...

2009-02-17 Thread Tyson Whitehead
(compiled with ghc 6.10 with options -O2 -ddump-simpl)

I was wondering why lvl_s1mF is not being inlined into a_s1Gv in the core at 
the bottom of this email as that is the only place it is ever referenced.

It also seems the extra levels of indirection are defeating the strictness 
analyzer on eta_s1CN in a_s1Gv as all code branches either directly force it 
or ultimately pass it to digit_s1l3 as in the included branch.

Also, why isn't digit_s1l3 optimized to take its first parameter unboxed?  It 
is strict in its first argument, and grepping the core shows that it is only 
ever used like in lvl_s1mF (i.e., passed things like lvl_s1mG).

Thanks!  -Tyson

PS:  Is there any way to get better control over the loop breaker choice?  For 
a slightly simpler digit function, it is not chosen, and great code is 
produced.  I've tried using INLINE on digit, but that seems to result in the 
monad bind operator not being inlined, which produces even worse code.

.
.
.
letrec {
  lvl_s1mF :: Control.Monad.State.Strict.StateT
GHC.Types.Int
(Control.Monad.State.Strict.StateT
   GHC.Types.Int
   (Control.Monad.Error.ErrorT
  GHC.Base.String Control.Monad.Identity.Identity))
GHC.Types.Int
  [Str: DmdType {s1l3->C(S)}]
  lvl_s1mF = digit_s1l3 lvl_s1mG;
  .
  .
  .
  a_s1Gv :: GHC.Types.Int
-> GHC.Types.Int
-> Control.Monad.Error.ErrorT
 [GHC.Types.Char]
 Control.Monad.Identity.Identity
 ((GHC.Types.Int, GHC.Types.Int), GHC.Types.Int)
  [Arity 2
   Str: DmdType U(L)L {a1hP->U(TTTL) s1ka->U(L)}]
  a_s1Gv =
\ (eta_a1px [ALWAYS Just U(L)] :: GHC.Types.Int)
  (eta_s1CN [ALWAYS Just L] :: GHC.Types.Int) ->
.
.

  GHC.Bool.True ->
(((lvl_s1mF
   `cast` (... ~
  GHC.Types.Int
  -> Control.Monad.State.Strict.StateT
   GHC.Types.Int
   (Control.Monad.Error.ErrorT
  GHC.Base.String Control.Monad.Identity.Identity)
   (GHC.Types.Int, GHC.Types.Int)))
(GHC.Types.I# (GHC.Prim.+# x_a1tl 1)))
 `cast` (... ~
GHC.Types.Int
-> Control.Monad.Error.ErrorT
 GHC.Base.String
 Control.Monad.Identity.Identity
 ((GHC.Types.Int, GHC.Types.Int), GHC.Types.Int)))
  eta_s1CN

.
.
  .
  .
  .
  digit_s1l3 [ALWAYS LoopBreaker Nothing]
 :: GHC.Types.Int
-> Control.Monad.State.Strict.StateT
 GHC.Types.Int
 (Control.Monad.State.Strict.StateT
GHC.Types.Int
(Control.Monad.Error.ErrorT
   GHC.Base.String
   Control.Monad.Identity.Identity))
 GHC.Types.Int
  [Arity 1
   Str: DmdType U(L)]
  digit_s1l3 =
\ (x_aqR [ALWAYS Just U(L)] ::GHC.Types.Int) ->
  case x_aqR
  of x_XsO [ALWAYS Just A] { GHC.Types.I# ipv_s1bt [ALWAYS Just L] ->
  let {
.
.
.
  } in
  (\ (eta_X1sC [ALWAYS Just L] :: GHC.Types.Int)
 (eta_s1FE [ALWAYS Just U(L)] :: GHC.Types.Int) ->
.
.
. )
  `cast` (... ~
 Control.Monad.State.Strict.StateT
   GHC.Types.Int
   (Control.Monad.State.Strict.StateT
  GHC.Types.Int
  (Control.Monad.Error.ErrorT
 GHC.Base.String Control.Monad.Identity.Identity))
   GHC.Types.Int)
  .
  .
  .
}
.
.
.
lvl_s1mG :: GHC.Types.Int
[Str: DmdType m]
lvl_s1mG = GHC.Types.I# 0



Re: ST monad and monad transformers

2009-02-03 Thread Tyson Whitehead
On February 2, 2009 14:50:10 Reid Barton wrote:
> On Mon, Feb 02, 2009 at 06:03:15PM +0100, Josef Svenningsson wrote:
> >
> > I also needed something like this a while ago so I knocked up a really
> > simple module and put it on hackage:
> > http://hackage.haskell.org/cgi-bin/hackage-scripts/package/STMonadTrans
>
> Warning!  The STMonadTrans package uses State# nonlinearly, and as a
> result, can violate referential transparency:

So, if I understand correctly, the underlying issue is that 

newtype STT s m a = STT (State# s -> m (STTRet s a))
data STTRet s a = STTRet (State# s) a

along with

STT m >>= k = STT $ \st -> 
  do ret <- m st
 case ret of
   STTRet new_st a -> 
   unSTT (k a) new_st

(or my equivalent versions) can multithread the "State# s" token depending on 
how the underlying m monad implements its bind operator.  As in your example

data Tree a = Leaf a | Branch (Tree a) (Tree a) deriving Show

Leaf a >>= k = k a
Branch l r >>= k = Branch (l >>= k) (r >>= k)

things break, as multithreading the "State# s" token is a strict no-no because 
it doesn't actually duplicate the underlying real world it represents.  StateT 
doesn't have this problem, as it really has a state that would then branch.
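A minimal runnable sketch of the breakage: the Tree monad from your example runs 
the continuation once per branch, so under STT the single State# s token would be 
fed into k twice (Functor/Applicative boilerplate added so it compiles on modern 
GHC):

```haskell
-- Tree is the branching base monad from the quoted example.
data Tree a = Leaf a | Branch (Tree a) (Tree a) deriving (Show, Eq)

instance Functor Tree where
  fmap f (Leaf a)     = Leaf (f a)
  fmap f (Branch l r) = Branch (fmap f l) (fmap f r)

instance Applicative Tree where
  pure = Leaf
  Leaf f     <*> t = fmap f t
  Branch l r <*> t = Branch (l <*> t) (r <*> t)

instance Monad Tree where
  Leaf a     >>= k = k a
  Branch l r >>= k = Branch (l >>= k) (r >>= k)

-- The continuation runs once per leaf; under STT's bind each run would
-- receive the very same State# s token, duplicating it.
fanOut :: Tree Int
fanOut = Branch (Leaf 1) (Leaf 2) >>= \x -> Leaf (x * 10)
```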

I guess then, if you wanted to salvage anything out of this, you would need 
something like a MonadSingleThreaded class that tags single threaded monads 
and is a class requirement for combining with the STT monad.

Is this correct?  And, apart from this, is it correct that ghc optimizations 
can't shuffle code in such a way that things break due to the single threading 
of the state token through the primitive operations such as newMutVar#?

Thanks  -Tyson




Re: ST monad and monad transformers

2009-02-02 Thread Tyson Whitehead
On February 2, 2009 11:26:02 Tyson Whitehead wrote:
> The STT type above is a version of ST like the ReaderT, StateT, etc. types.
>
> newtype STT s m a = STT ( State# s -> m (STTBox s a) )
> data STTBox s a = STTBox {-#UNPACK#-} !(State# s) {-#UNPACK#-} !a
>
> runSTT :: (Monad m) => (forall s. STT s m a) -> m a
> runSTT m = case m of STT m' -> do STTBox _ x <- m' realWorld#
>   return x
>
> instance Monad m => Monad (STT s m) where
> return x = STT $ \s -> return $ STTBox s x
> (STT m) >>= k = STT $ \s -> do STTBox s' x <- m s
>case k x of STT k' -> k' s'

Of course, I forgot the method to actually use state threaded code

stToSTT :: Monad m => ST s a -> STT s m a
stToSTT (ST m) = STT $ \s -> case m s of (# s',x #) -> return $ STTBox s' x

In re-reading my original email, I also thought I might not have been clear 
that I did write the instance methods (MonadCont, etc.), I just didn't include 
them as they would have made the email too lengthy.

Cheers!  -Tyson

PS:  Thanks for all the comments so far.




ST monad and monad transformers

2009-02-02 Thread Tyson Whitehead
I have a situation in which I believe I need a parameterizable version of the 
strict ST monad.  My computation type is "StateT s' (STT s (ErrorT e m)) a" 
(i.e., fails or succeeds and has an internal state involving a state thread).

The STT type above is a version of ST like the ReaderT, StateT, etc. types.

newtype STT s m a = STT ( State# s -> m (STTBox s a) )
data STTBox s a = STTBox {-#UNPACK#-} !(State# s) {-#UNPACK#-} !a

(I'm guessing on the UNPACK pragmas here) with

runSTT :: (Monad m) => (forall s. STT s m a) -> m a
runSTT m = case m of STT m' -> do STTBox _ x <- m' realWorld# 
  return x

(writing this as "runSTT (STT m') = ..." doesn't typecheck with ghc 6.8.2)

instance Monad m => Monad (STT s m) where
return x = STT $ \s -> return $ STTBox s x
(STT m) >>= k = STT $ \s -> do STTBox s' x <- m s
   case k x of STT k' -> k' s'

plus all the assorted instances for Functor, MonadPlus, MonadFix, MonadTrans, 
MonadReader, MonadState, etc.  For example,

instance MonadWriter w m => MonadWriter w (STT s m) where
tell = lift . tell
listen (STT m) = STT $ \s -> do (STTBox s' x,w) <- listen $ m s
return $ STTBox s' (x,w)
pass   (STT m) = STT $ \s -> pass $ do STTBox s' (x,f) <- m s
   return (STTBox s' x,f)
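A sketch of the MonadTrans instance alluded to above, with the STT/STTBox 
definitions repeated so the fragment stands alone (runSTT' fixes s to RealWorld 
to sidestep the rank-2 type, purely for illustration):

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (RealWorld, State#, realWorld#)
import Control.Monad.Trans.Class (MonadTrans (..))

newtype STT s m a = STT (State# s -> m (STTBox s a))
data STTBox s a = STTBox (State# s) a

-- lift runs the inner action once and threads the untouched token on.
instance MonadTrans (STT s) where
  lift m = STT $ \s -> do x <- m
                          return (STTBox s x)

-- Monomorphic runner for illustration only (the real runSTT needs the
-- rank-2 `forall s.` shown earlier in the message).
runSTT' :: Monad m => STT RealWorld m a -> m a
runSTT' (STT f) = do STTBox _ x <- f realWorld#
                     return x
```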

I was looking for any comments, wondering if there is a reason for this not 
existing in the library already, and what I should do in terms of pragmas and 
such for speed?  I see the GHC-ST file has a mix of INLINE and NOINLINE.

http://www.haskell.org/ghc/dist/current/docs/libraries/base/src/GHC-ST.html

In particular, return, >>=, >>, and runST are marked INLINE, but there is a 
"regrettably delicate" comment that goes with the runST method.  Also, what 
about the Functor, MonadPlus, MonadFix, MonadTrans, MonadReader, etc. methods?

Thanks! -Tyson

PS:  I would be happy to provide the whole works to be added to the library if 
it is something that should be there.




Re: UVector overallocating for (Word/Int)(8/16/32)

2009-01-30 Thread Tyson Whitehead
On Friday 30 January 2009 04:07:19 you wrote:
> This is in uvector, right?  Then you should probably forward this to the
> maintainer, i.e. Don Stewart .

Done.  Thanks for the follow up.

Cheers!  -Tyson




UVector overallocating for (Word/Int)(8/16/32)

2009-01-29 Thread Tyson Whitehead
I believe the arrays for (Word/Int)(8/16/32) are currently taking eight, four, 
and two times, respectively, as much memory as actually required.  That is,

newMBU n = ST $ \s1# ->
  case sizeBU n (undefined::e) of {I# len#  ->
  case newByteArray# len# s1#   of {(# s2#, marr# #) ->
  (# s2#, MBUArr n marr# #) }}

sizeBU (I# n#) _ = I# (wORD_SCALE n#)
wORD_SCALE   n# = scale# *# n# where I# scale# = SIZEOF_HSWORD

(sizeBU is a class member, but all the instances for (Word/Int)(8/16/32) are 
as given above, and SIZEOF_HSWORD is defined as 8 in MachDeps.h on my x86_64)
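The eight-, four-, and two-fold overallocation figures follow directly from 
scaling every element count by SIZEOF_HSWORD (8 bytes on x86_64) rather than by 
each element's own size, which can be checked with Foreign.Storable (this sketch 
assumes a 64-bit machine):

```haskell
import Data.Word (Word16, Word32, Word64, Word8)
import Foreign.Storable (sizeOf)

-- Ratio of the 8-byte word scale (what the shared sizeBU always used)
-- to each element type's actual size:
overallocation :: [(String, Int)]
overallocation =
  [ ("Word8",  8 `div` sizeOf (undefined :: Word8))   -- 8x
  , ("Word16", 8 `div` sizeOf (undefined :: Word16))  -- 4x
  , ("Word32", 8 `div` sizeOf (undefined :: Word32))  -- 2x
  , ("Word64", 8 `div` sizeOf (undefined :: Word64))  -- 1x
  ]
```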

which would seem to always allocate memory assuming an underlying alignment 
that is always eight bytes.  It seems like the readWord(8/16/32)Array# 
functions may have once operated that way, but, when I dumped the assembler 
associated with them under ghc 6.8.2 (both native and C), I get

readWord8Array
  leaq 16(%rsi),%rax
  movzbl (%rax,%rdi,1),%ebx
  jmp *(%rbp)

readWord16Array
  leaq 16(%rsi),%rax
  movzwl (%rax,%rdi,2),%ebx
  jmp *(%rbp)

readWord32Array
  leaq 16(%rsi),%rax
  movl (%rax,%rdi,4),%ebx
  jmp *(%rbp)

readWord64Array
  leaq 16(%rsi),%rax
  movq (%rax,%rdi,8),%rbx
  jmp *(%rbp)

which is using alignments of one, two, four, and eight bytes respectively.

I'll attach a patch (which I haven't tested beyond compiling and looking at 
the generated assembler).

Cheers!  -Tyson



--- Data/Array/Vector/Prim/BUArr.hs_	2009-01-29 16:55:55.0 -0500
+++ Data/Array/Vector/Prim/BUArr.hs	2009-01-29 17:10:27.0 -0500
@@ -324,7 +324,7 @@
 (# s2#, () #)}
 
 instance UAE Int where
-  sizeBU (I# n#) _ = I# (wORD_SCALE n#)
+  sizeBU (I# n#) _ = I# (iINT_SCALE n#)
 
   {-# INLINE indexBU #-}
   indexBU (BUArr (I# s#) n ba#) i@(I# i#) =
@@ -416,7 +416,7 @@
 (# s2#, () #)}
 
 instance UAE Word8 where
-  sizeBU (I# n#) _ = I# (wORD_SCALE n#)
+  sizeBU (I# n#) _ = I# (wORD8_SCALE n#)
 
   {-# INLINE indexBU #-}
   indexBU (BUArr (I# s#) n ba#) i@(I# i#) =
@@ -439,7 +439,7 @@
 (# s2#, () #)}
 
 instance UAE Word16 where
-  sizeBU (I# n#) _ = I# (wORD_SCALE n#)
+  sizeBU (I# n#) _ = I# (wORD16_SCALE n#)
 
   {-# INLINE indexBU #-}
   indexBU (BUArr (I# s#) n ba#) i@(I# i#) =
@@ -462,7 +462,7 @@
 (# s2#, () #)}
 
 instance UAE Word32 where
-  sizeBU (I# n#) _ = I# (wORD_SCALE n#)
+  sizeBU (I# n#) _ = I# (wORD32_SCALE n#)
 
   {-# INLINE indexBU #-}
   indexBU (BUArr (I# s#) n ba#) i@(I# i#) =
@@ -485,7 +485,7 @@
 (# s2#, () #)}
 
 instance UAE Word64 where
-  sizeBU (I# n#) _ = I# (wORD_SCALE n#)
+  sizeBU (I# n#) _ = I# (wORD64_SCALE n#)
 
   {-# INLINE indexBU #-}
   indexBU (BUArr (I# s#) n ba#) i@(I# i#) =
@@ -508,7 +508,7 @@
 (# s2#, () #)}
 
 instance UAE Int8 where
-  sizeBU (I# n#) _ = I# (wORD_SCALE n#)
+  sizeBU (I# n#) _ = I# (iNT8_SCALE n#)
 
   {-# INLINE indexBU #-}
   indexBU (BUArr (I# s#) n ba#) i@(I# i#) =
@@ -531,7 +531,7 @@
 (# s2#, () #)}
 
 instance UAE Int16 where
-  sizeBU (I# n#) _ = I# (wORD_SCALE n#)
+  sizeBU (I# n#) _ = I# (iNT16_SCALE n#)
 
   {-# INLINE indexBU #-}
   indexBU (BUArr (I# s#) n ba#) i@(I# i#) =
@@ -554,7 +554,7 @@
 (# s2#, () #)}
 
 instance UAE Int32 where
-  sizeBU (I# n#) _ = I# (wORD_SCALE n#)
+  sizeBU (I# n#) _ = I# (iNT32_SCALE n#)
 
   {-# INLINE indexBU #-}
   indexBU (BUArr (I# s#) n ba#) i@(I# i#) =
@@ -577,7 +577,7 @@
 (# s2#, () #)}
 
 instance UAE Int64 where
-  sizeBU (I# n#) _ = I# (wORD_SCALE n#)
+  sizeBU (I# n#) _ = I# (iNT64_SCALE n#)
 
   {-# INLINE indexBU #-}
   indexBU (BUArr (I# s#) n ba#) i@(I# i#) =
@@ -858,16 +858,24 @@
 -- Translation between elements and bytes
 -- Duplicated here from Data.Array.Base to avoid build dependency
 
-cHAR_SCALE, wORD_SCALE, dOUBLE_SCALE, fLOAT_SCALE :: Int# -> Int#
-cHAR_SCALE   n# = scale# *# n# where I# scale# = SIZEOF_HSCHAR
-wORD_SCALE   n# = scale# *# n# where I# scale# = SIZEOF_HSWORD
-dOUBLE_SCALE n# = scale# *# n# where I# scale# = SIZEOF_HSDOUBLE
-fLOAT_SCALE  n# = scale# *# n# where I# scale# = SIZEOF_HSFLOAT
-
-wORD16_SCALE, wORD32_SCALE, wORD64_SCALE :: Int# -> Int#
-wORD16_SCALE n# = scale# *# n# where I# scale# = SIZEOF_WORD16
-wORD32_SCALE n# = scale# *# n# where I# scale# = SIZEOF_WORD32
-wORD64_SCALE n# = scale# *# n# where I# scale# = SIZEOF_WORD64
+cHAR_SCALE, wORD_SCALE, iINT_SCALE, dOUBLE_SCALE, fLOAT_SCALE :: Int# -> Int#
+cHAR_SCALE   n# = scale# *# n# where I# scale# = ALIGNMENT_HSCHAR
+iINT_SCALE   n# = scale# *# n# where I# scale# = ALIGNMENT_HSINT
+wORD_SCALE   n# = scale# *# n# where I# scale# = ALIGNMENT_HSWORD
+dOUBLE_SCALE n# = scale# *# n# where I# scale# = ALIGNMENT_HSDOUBLE
+fLOAT_SCALE  n# = scale# *# n# where I# scale# = ALIGNMENT_HSFLOAT
+
+wORD8_SCALE, wORD16_SCALE, wORD32_SCALE, wORD64_SCALE :: Int# -> Int#
+wORD8_SCALE  n# = scale# *# n# where I# scale# = ALIGNMENT_WORD8
+wORD16_SCALE n# = scale# *# n# where I# scale# = ALIGNMENT_WORD16
+wORD32_SCALE n# = scale# *# n# where I# scale# = ALIGNMENT_WORD32
+wORD64_SCALE n# = scale# *# n# where I# scale# = ALIGNMENT_WORD64

Questions regarding uvector 0.1.0.3

2009-01-18 Thread Tyson Whitehead
I've been looking at uvector and had some questions (hopefully this is the 
right mailing list for this, if not, sorry, and please direct me properly).

First, is there any design reason deeper than just supporting things like the 
unit and pair arrays for layering UArr on top of the underlying BUArr?

Second, the STG code and the native and via-C assembler generated by ghc 6.8.2 
with -O2 for

test :: UArr Int -> Int -> Int
test = indexU

where

indexU :: UA e => UArr e -> Int -> e
indexU arr n = indexS (streamU arr) n

is a loop that simultaneously increments the offset while decrementing the 
index until it reaches the desired position.

If I understand correctly, this indexU definition was used instead of just 
bringing indexU from UArr into scope in order to cancel any possible unstreamU 
in arr, and this is a big part of the whole streamU/unstreamU magic.  It 
replaces with one rule what would otherwise require N^2 rules to avoid a bunch 
of temporaries.

I wonder if a rule could be added to change any outstanding "indexS (streamU 
arr) n" into an UArr indexU after streamU/unstreamU fusion?  Such rules for 
each of these types of functions (i.e., ones that aren't of the form "= 
unstreamU . ...") would also not be N^2 and would eliminate temporary streams.
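The kind of cleanup rule being proposed could be sketched like this, with Stream, 
streamU, and indexS as simplified stand-ins for the uvector internals (not the 
library's actual definitions):

```haskell
-- Toy stand-ins for the uvector internals.
newtype Stream a = Stream [a]

{-# NOINLINE streamU #-}
streamU :: [a] -> Stream a
streamU = Stream

{-# NOINLINE indexS #-}
indexS :: Stream a -> Int -> a   -- linear walk, as in the generated loop
indexS (Stream xs) n = xs !! n

indexDirect :: [a] -> Int -> a   -- the direct (non-stream) index
indexDirect = (!!)

-- The proposed cleanup: any `indexS (streamU arr) n` left over after
-- streamU/unstreamU fusion is rewritten back to direct indexing.
{-# RULES
"indexS/streamU" forall arr n. indexS (streamU arr) n = indexDirect arr n
  #-}
```

Either side computes the same value, so the rule only changes how the lookup is 
performed.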

Thanks!  -Tyson

PS:  Nice code by the way.  : )





Re: Strictness in data declaration not matched in assembler?

2008-10-19 Thread Tyson Whitehead
On Thursday 16 October 2008 14:34:01 Don Stewart wrote:
> FWIW, I get much nicer code with uvector (which uses type families
> to select monomorphic instances of things, and aggressive inlining, to
> yield much better code in practice). The DPH arrays library uses
> a similar method.

Thanks for the hint Don!  : )

I've had a look at the documentation, and I'll hopefully give it a try soon.

Cheers!  -Tyson

PS:  Is the documentation still up-to-date with regard to recommending compiling via C?


Re: Strictness in data declaration not matched in assembler?

2008-10-16 Thread Tyson Whitehead
On Thursday 16 October 2008 07:03:05 Roman Leshchinskiy wrote:
> On 16/10/2008, at 21:34, Simon Peyton-Jones wrote:
> > BUT people who care probably UNPACK their strict fields too, which
> > is even better.  The time you can't do that is for sum types
> >data T = MkT ![Int]
>
> You also can't do it for polymorphic components. I've used code like:
>
>data T a = MkT !a
>
>foo :: T (a,b) -> a
>foo (MkT (x,y)) = x
>
> Here, unpacking doesn't work but foo could still access the components
> of the pair directly.

This is actually the situation I was originally looking at.  I just simplified 
it for the sake of posting readable core and assembler.


Specifically, I was looking at some of the assembler GHC was generating for 
some array code to see if it could do a clean enough job to be used instead of 
C, and was finding this sort of thing because STUArray is defined as

data STUArray s i a = STUArray !i !i !Int (MutableByteArray# s)

I also seem to recall seeing the same sort of thing in some of the state code 
I was also looking at because STRep is defined as

type STRep s a = State# s -> (# State# s, a #)

(the a being the issue here).


With regard to cost, it is probably not that representative, but a typical 
code path for the toy example I posted earlier goes from

leaq -8(%rbp),%rax
cmpq %r14,%rax
-- not taken jump -- (stack overflow check passed)
movq %rsi,%rbx
movq $sni_info,-8(%rbp)
addq $-8,%rbp
testq $7,%rbx
-- taken jump -- (x argument had already been forced)
movq 7(%rbx),%rbx
movq $snj_info,(%rbp)
testq $7,%rbx
-- taken jump -- (strict constructor argument is already forced)
cmpq $1,7(%rbx)
-- not taken jump -- (first of the case options)
movl $Main_lvl_closure+1,%ebx
addq $8,%rbp
jmp *(%rbp)

to

leaq -8(%rbp),%rax
cmpq %r14,%rax
-- not taken jump -- (stack overflow check passed)
movq %rsi,%rbx
movq $sni_info,-8(%rbp)
addq $-8,%rbp
testq $7,%rbx
-- taken jump -- (x argument has already been forced)
movq 7(%rbx),%rbx
cmpq $1,7(%rbx)
-- not taken jump -- (first of the case options)
movl $Main_lvl_closure+1,%ebx
addq $8,%rbp
jmp *(%rbp)

which is a 22% reduction (18 to 14) in instructions executed in the entire 
function, or a 40% reduction (10 to 6) in instruction executed in the core of 
the function (i.e., after the function's argument is possibly forced).

Cheers!  -Tyson

PS:  Thanks everyone for the very informative and interesting discussion.


Re: Strictness in data declaration not matched in assembler?

2008-10-15 Thread Tyson Whitehead
On Wednesday 15 October 2008 10:48:26 you wrote:
> Strictness does not imply unboxing.
>
> To see why not, think about the fact that unboxing breaks sharing. By
> keeping the pointer-indirection in place, we can share even strict
> fields between related values.

I believe I realize that.  What I was wondering about was the fact that it 
seemed to think the pointer might be to a thunk (instead of a constructor 
closure).  Doesn't the strictness flag mean the following assembler would work

sni_info:
movq 7(%rbx),%rbx
movq $snj_info,(%rbp)
jmp snj_info

(which could be cleaned up further by combining it with snj_info) instead of

sni_info:
movq 7(%rbx),%rbx
movq $snj_info,(%rbp)
testq $7,%rbx
jne snj_info
jmp *(%rbx)

(i.e., the whole test-if-it-is-a-thunk-and-conditionally-evaluate-it bit is 
unnecessary due to the constructor's strictness flag).

Cheers!  -Tyson


Strictness in data declaration not matched in assembler?

2008-10-15 Thread Tyson Whitehead
Consider the following code

data Data = Data { unData :: !Int }

func :: Data -> Int
func x = case unData x of
   1 -> 2
   _ -> 0

Compiling with GHC 6.8.2 gives the following stg code

Main.func =
\r [x_slg]
case x_slg of tpl_slx {
  Main.Data ipv_slj ->
  case ipv_slj of wild_sly {
GHC.Base.I# ds_slm ->
case ds_slm of ds1_slz {
  __DEFAULT -> Main.lvl1;
  1 -> Main.lvl;
};
  };
};

The native code generator turns it into the following x86_64 assembler

Main_func_info:
leaq -8(%rbp),%rax
cmpq %r14,%rax
jb .LcnU
movq %rsi,%rbx
movq $sni_info,-8(%rbp)
addq $-8,%rbp
testq $7,%rbx
jne sni_info
jmp *(%rbx)
.LcnU:
movl $Main_func_closure,%ebx
jmp *-8(%r13)

sni_info:
movq 7(%rbx),%rbx
movq $snj_info,(%rbp)
testq $7,%rbx
jne snj_info
jmp *(%rbx)

snj_info:
cmpq $1,7(%rbx)
jne .LcnE
movl $Main_lvl_closure+1,%ebx
addq $8,%rbp
jmp *(%rbp)
.LcnE:
movl $Main_lvl1_closure+1,%ebx
addq $8,%rbp
jmp *(%rbp)

It seems to me that the !Int member of the Data constructor is being treated 
like it might be a thunk in sni_info (i.e., the whole "testq $7,%rbx" thing).

Isn't this unnecessary as the "!" strictness flag means the Int argument must 
be forced by the Data constructor before being captured?
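For comparison, a hedged sketch of the UNPACK variant discussed elsewhere in this 
thread: with the field unpacked the Int is stored unboxed in the constructor, so 
there is no inner pointer to test for thunk-hood at all.

```haskell
-- Same function as above, but with the Int field unpacked.
data Data = Data { unData :: {-# UNPACK #-} !Int }

func :: Data -> Int
func x = case unData x of
           1 -> 2
           _ -> 0
```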

Thanks! -Tyson


Implicit Function Arguments

2008-06-28 Thread Tyson Whitehead
What do people think about implicit function arguments?

That is, named function arguments, possibly tagged with something special like 
a tilde, whereby scoped variables of the same name are automatically passed.

The idea is to avoid the pain of lots of pass-through parameters while making 
it easy to modify one or two of them.  For example, in

f1 :: ~state::State -> Input -> Output
f1 input = f2 input

f2 :: ~state::State -> Input -> Output
f2 input = 

we don't have to specify ~state for f2 because it picks it up automatically 
from ~state being in scope from f1.  To change it, however, we just do

f1 :: ~state::State -> Input -> Output
f1 input = f2 input
  where ~state = 

The advantage over wrapping it in a monad being:

1) it is evident what is being passed around behind the scenes from the type 
signatures, and

2) we avoid the lifting issue (to compose multiple implicit arguments we 
just specify them -- assuming there are no name clashes).
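GHC's ImplicitParams extension already provides something close to this proposal 
(with ? rather than ~).  A minimal sketch, where State and the function bodies 
are assumed placeholders:

```haskell
{-# LANGUAGE ImplicitParams #-}

type State = Int   -- assumed placeholder for the State type above

f2 :: (?state :: State) => Int -> Int
f2 input = input + ?state      -- ?state picked up from the caller's scope

f1 :: (?state :: State) => Int -> Int
f1 input = f2 input            -- passed through with no explicit plumbing

-- Overriding locally, analogous to the `where ~state = ...` example:
f0 :: Int -> Int
f0 input = let ?state = 100 in f1 input
```

As with the proposal, the implicit argument shows up in the type signatures, and 
multiple implicit parameters compose by just naming them in the constraint.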

Cheers!  -Tyson



Re: Recursive functions and constant parameter closures (inlining/strictness analyzer question)

2008-06-23 Thread Tyson Whitehead
On Friday 20 June 2008 21:34:14 Dan Doel wrote:
> On Friday 20 June 2008, Max Bolingbroke wrote:
> > Of course, if you have any suggestions for good heuristics based on
> > your benchmarking experience then we would like to hear them! There
> > was some discussion of this in the original ticket,
> > http://hackage.haskell.org/trac/ghc/ticket/888, but when implementing
> > SAT I tried out the suggestions made there without good results
> > (though to be perfectly honest I didn't really understand the
> > motivations behind some of the suggestions made).
>
> However, if I had to pick something out of the air, I'd say this: always do
> SAT when the argument in question is a function. This is actually the
> reason I started doing it in the first place. I was working on sorting
> uvectors, and had what I thought was pretty good performance, and then I
> did SAT on the comparison function, and suddenly my code was running in
> half the time. Going back to my sorting (testing on introsort), SAT for the
> array doesn't seem to do much, but undoing the SAT for the comparison
> function causes heap allocation to balloon, and everything gets at least a
> factor of 2 slower.
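The static argument transformation on a comparison function that the quoted 
message describes can be sketched as follows (a toy quicksort, not the uvector 
introsort being benchmarked):

```haskell
-- The comparison function cmp is a static argument: it never changes
-- across recursive calls, so it is captured once by the local loop
-- instead of being re-passed (and re-examined) on every call.
sortBySAT :: (a -> a -> Ordering) -> [a] -> [a]
sortBySAT cmp = go
  where
    go []     = []
    go (p:xs) = go [x | x <- xs, cmp x p /= GT]
             ++ [p]
             ++ go [x | x <- xs, cmp x p == GT]
```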

I've been wondering if a nice option would be to be able to feed profiler 
information in at compile time and have it override the heuristics.

That way, inlining, specialization, SAT, etc., decisions could be made based 
on how the code actually gets used during a typical run of the program.

Cheers!  -Tyson






Recursive functions and constant parameter closures (inlining/strictness analyzer question)

2008-05-29 Thread Tyson Whitehead

main = print $ foldl' (+) 0 [1..]

with

foldl' f y xs = foldl' y xs
  where foldl' y [] = y
        foldl' y (x:xs) = foldl' (f y x) xs

runs indefinitely with very little memory consumption, while

foldl' f y [] = y
foldl' f y (x:xs) = foldl' f (f y x) xs

rapidly consumes all the machine's memory and dies.
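Renamed so the two definitions can coexist in one module, the difference is just 
where the constant parameter f is captured:

```haskell
-- The first definition: f is captured once by the local loop.
foldlLocal :: (b -> a -> b) -> b -> [a] -> b
foldlLocal f = go
  where go y []     = y
        go y (x:xs) = go (f y x) xs

-- The second: f is re-passed on every recursive call.
foldlTop :: (b -> a -> b) -> b -> [a] -> b
foldlTop _ y []     = y
foldlTop f y (x:xs) = foldlTop f (f y x) xs
```

Both compute the same result on finite lists; the behaviour described above comes 
from how each shape interacts with GHC's inlining and strictness analysis.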


Running ghc with  -ddump-stranal shows the outer foldl' of the first 
gets inlined into main as a call to the following specialized version of 
the inner foldl':


foldl'_sSY [ALWAYS LoopBreaker Nothing] :: GHC.Num.Integer
  -> [GHC.Num.Integer]
  -> GHC.Num.Integer
[Arity 2
Str: DmdType SS]
foldl'_sSY =
 \ (y_aj7 [ALWAYS Just S] :: GHC.Num.Integer)
   (ds_dQl [ALWAYS Just S] :: [GHC.Num.Integer]) ->
   case ds_dQl of wild_XH [ALWAYS Just A] {
 [] -> y_aj7;
 : x_aja [ALWAYS Just S] xs_ajb [ALWAYS Just S] ->
   foldl'_sSY (GHC.Num.plusInteger y_aj7 x_aja) xs_ajb
   }


Doing the same with the second foldl' shows it remains non-inlined 
and fully polymorphic:


foldl'_sQN [ALWAYS LoopBreaker Nothing] :: forall t_auW t_av2.
  (t_av2 -> t_auW -> t_av2) -> 
t_av2 -> [t_auW] -> t_av2

[Arity 3
Str: DmdType LLS]
foldl'_sQN =
 \ (@ t_auW)
   (@ t_av2)
   (f_aj0 [ALWAYS Just L] :: t_av2 -> t_auW -> t_av2)
   (y_aj1 [ALWAYS Just L] :: t_av2)
   (ds_dQg [ALWAYS Just S] :: [t_auW]) ->
   case ds_dQg of wild_XK [ALWAYS Just A] {
 [] -> y_aj1;
 : x_aj5 [ALWAYS Just L] xs_aj6 [ALWAYS Just S] ->
   foldl'_sQN @ t_auW @ t_av2 f_aj0 (f_aj0 y_aj1 x_aj5) xs_aj6
   }


Forcing it inline with {-# INLINE foldl' #-} just specialized it:

foldl'_sSS [ALWAYS LoopBreaker Nothing] :: (GHC.Num.Integer
   -> GHC.Num.Integer
   -> GHC.Num.Integer)
  -> GHC.Num.Integer
  -> [GHC.Num.Integer]
  -> GHC.Num.Integer
[Arity 3
Str: DmdType LLS]
foldl'_sSS =
 \ (f_aj0 [ALWAYS Just L] :: GHC.Num.Integer
 -> GHC.Num.Integer
 -> GHC.Num.Integer)
   (y_aj1 [ALWAYS Just L] :: GHC.Num.Integer)
   (ds_dQg [ALWAYS Just S] :: [GHC.Num.Integer]) ->
   case ds_dQg of wild_XI [ALWAYS Just A] {
 [] -> y_aj1;
 : x_aj5 [ALWAYS Just L] xs_aj6 [ALWAYS Just S] ->
   foldl'_sSS f_aj0 (f_aj0 y_aj1 x_aj5) xs_aj6
   }


I thought this was interesting.  Is it to be expected?  Am I right in 
interpreting this to mean it was just too much for the strictness 
analyzer?  I believe the first ultimately produces significantly 
superior code, so should one always write their recursive functions such 
that the constant (functional?) parameters are first captured in a closure?


In that vein, would it be useful if the compiler automatically 
transformed the second into the first?


Thanks!  -Tyson