from:"Bardur Arantsson"

Re: [Haskell-cafe] Mystery of an Eq instance

2013-09-21 Thread Bardur Arantsson

On 2013-09-21 06:16, Mike Meyer wrote:
  The single biggest gotcha is that two calculations
 we expect to be equal often aren't. As a result of this, we warn
 people not to do equality comparison on floats.

The Eq instance for Float violates at least one expected law of Eq:

  Prelude> let nan = 0/0
  Prelude> nan == nan
  False

There was a proposal to change this, but it didn't really go anywhere. See:

   http://permalink.gmane.org/gmane.comp.lang.haskell.libraries/16218

(FWIW, even if the instances cannot be changed/removed, I'd love to see
some sort of explicit opt-in before these dangerous/surprising instances
become available.)
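
Both gotchas mentioned above can be reproduced in a few lines. This is only a sketch: the names sumIsExact, nanIsReflexive and approxEq are made up for illustration, and the tolerance-based comparison is one common workaround rather than anything proposed in this thread.

```haskell
-- Two calculations we expect to be equal often aren't:
-- 0.1 + 0.2 is not the same Double as 0.3.
sumIsExact :: Bool
sumIsExact = 0.1 + 0.2 == (0.3 :: Double)

nan :: Double
nan = 0 / 0

-- Reflexivity (x == x) is commonly expected of Eq, but fails for NaN.
nanIsReflexive :: Bool
nanIsReflexive = nan == nan

-- Hypothetical helper: compare within a caller-chosen tolerance
-- instead of using (==) directly.
approxEq :: Double -> Double -> Double -> Bool
approxEq eps a b = abs (a - b) <= eps
```

Choosing eps is problem-specific; approxEq 1e-9 is merely an example value.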

Regards,


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Mystery of an Eq instance

2013-09-21 Thread Bardur Arantsson

On 2013-09-20 18:31, Brandon Allbery wrote:
[--snip--]
 unless you have a very clever representation that can store
 in terms of some operation like sin(x) or ln(x).)

I may just be hallucinating, but I think this is called describable
numbers, i.e. numbers which can be described by some (finite) formula.

Not sure how useful they would be in practice, though :).


Re: [Haskell-cafe] Mystery of an Eq instance

2013-09-21 Thread Bardur Arantsson

On 2013-09-21 23:08, Mike Meyer wrote:
 Exactly. The Eq and Ord instances aren't what's broken, at least when
 you're dealing with numbers (NaNs are another story). That there are pairs

According to Haskell NaN *is* a number.

 Eq and Ord are just the messengers.

No. When we declare something an instance of Monad or Applicative (for
example), we expect(*) that thing to obey certain laws. Eq and Ord
instances for Float/Double do *not* obey the expected laws.
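
To make the Ord half of that concrete, here is a small sketch. It assumes GHC's instance for Double, where compare is defined in terms of (<) and (==); the names allComparisonsFalse and compareBothWays are invented for the example.

```haskell
nan :: Double
nan = 0 / 0

-- Every ordinary comparison involving NaN is False...
allComparisonsFalse :: Bool
allComparisonsFalse =
  not (nan < 1) && not (nan > 1) && not (nan == 1) && not (nan <= nan)

-- ...so compare has no consistent answer to give. With GHC's instance it
-- falls through to GT in both directions, which breaks the expectation
-- that swapping the arguments flips the result.
compareBothWays :: (Ordering, Ordering)
compareBothWays = (compare nan 1, compare 1 nan)
```

Anything built on compare (sorting, Data.Map keys, maximum) inherits the inconsistency once a NaN gets in.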

Regards,

/b

(*) Alas, in general, the compiler cannot prove these things, so we rely
on assertion or trust.



Re: [Haskell-cafe] Lifting strictness to types

2013-08-22 Thread Bardur Arantsson

On 2013-08-22 18:19, Thiago Negri wrote:
 I think Scala has this optional laziness too.

Indeed, but it's _not_ apparent in types (which can be an issue).

Due to the somewhat weird constructor semantics of the JVM it also means
you can have immutable values which start out(!) as null and end up
being non-null.

Regards,



Re: [Haskell-cafe] Alternative name for return

2013-08-07 Thread Bardur Arantsson

On 2013-08-07 22:38, Joe Quinn wrote:
 On 8/7/2013 11:00 AM, David Thomas wrote:
 twice :: IO () -> IO ()
 twice x = x >> x

 I would call that evaluating x twice (incidentally creating two
 separate evaluations of one pure action description), but I'd like to
 better see your perspective here.
 
 x is only evaluated once, but /executed/ twice. For IO, that means
 magic. For other types, it means different things. For Identity, twice =
 id!
 

Your point being? x is the same thing regardless of how many times you
run it.
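
A sketch of the evaluated-once/executed-twice distinction, for anyone following along. Assumptions: twice is generalised from IO () to any Monad so the Identity case type-checks, and countRuns/twiceInIdentity are names invented here.

```haskell
import Data.Functor.Identity (Identity (..))
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Generalised from the IO-only signature quoted above (an assumption).
twice :: Monad m => m () -> m ()
twice x = x >> x

-- 'action' is a single value describing an effect; 'twice' runs that
-- one description two times, so the counter ends up at 2.
countRuns :: IO Int
countRuns = do
  counter <- newIORef (0 :: Int)
  let action = modifyIORef' counter (+ 1)
  twice action
  readIORef counter

-- In Identity there is no effect to repeat, so twice behaves like id.
twiceInIdentity :: ()
twiceInIdentity = runIdentity (twice (Identity ()))
```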





Re: [Haskell-cafe] ANNOUNCE: hdbi-1.0.0 and hdbi-postgresql-1.0.0

2013-07-31 Thread Bardur Arantsson

On 2013-07-31 09:22, Alexey Uimanov wrote:

 Regarding parameterized SQL: It might be worth using named parameters (e.g.
 :foo and :bar or something like that) rather than ? as
 placeholders in SQL/prepared SQL. This will make it slightly more
 flexible if you need to provide different SQL strings for different
 databases, but want to reuse the code which does the actual running of
 the SQL. It's also more flexible if you need to repeat parameters -- the
 latter is typical with PostgreSQL if you want to emulate
 update-or-insert in a single SQL statement

 
 Named parameters might be more flexible, but one needs to think hard about
 how to implement this.
 If you are using named parameters you need to pass not just a list [SqlValue]
 as parameters,
 but Map Text SqlValue or something. So named parameters will not be
 compatible with unnamed ones and will need
 a separate query parser.
 

The use case I'm thinking of is something like this:

  reportSQL :: DatabaseType -> SQL
  reportSQL MySQL = "
     SELECT ... custName = :custName ...
     INTERSECTION
     SELECT ... custName = :custName
     "
  reportSQL PostgreSQL = "
     SELECT ... AS cust
       WHERE cust.custName = :custName
     FROM SELECT ... AS foo
       WHERE foo.custName = cust.custName
     "

For this fictitious example we imagine that PostgreSQL can handle a
nested query of some particular shape where we need an INTERSECTION
query in MySQL. Obviously this is a made up example, but you get the
idea. The point is that the MySQL query may need to refer to the
:custName parameter multiple times whereas the PostgreSQL one doesn't.
Similarly the positions in the SQL may need to be different.

You perhaps still want to have a way to run both variants using the
exact same code:

   runReport :: DatabaseType -> Text -> IO whatever
   runReport databaseType customerName = do
     result <- runSQLWithParameters (reportSQL databaseType)
                 [("custName", customerName)]
     ... do stuff with result ...
     return whatever

Of course this being Haskell you can always use higher-order functions
(e.g. a function DatabaseType -> Text -> IO QueryResult which
encompasses the runSQLWithParameters *and* reportSQL function, but then
you're mixing up the running of the query with the query itself) for
similar purposes, but I tend to find named parameters of this type to be
quite useful for readability.

As Kirill mentioned, you can also use numbered parameters, but I tend to
like named parameters for readability.

Implementation should be reasonably simple: Replace all :xyz (or
whatever syntax you choose) parameters in input with $n and maintain a
map which tells you which named parameter each of $1, $2, etc.
corresponds to.
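
A rough sketch of that rewriting step, to show it really is small. Everything here is hypothetical (toPositional and bindParams are not part of HDBI or HDBC), it works on String for brevity, and it deliberately ignores string literals and comments inside the SQL, which a real implementation would have to skip.

```haskell
import Data.Char (isAlphaNum)
import Data.List (elemIndex)

-- | Rewrite ":name" placeholders to "$1", "$2", ... and return the
-- parameter names in positional order (the head of the list is $1).
-- A repeated name reuses its existing number.
toPositional :: String -> (String, [String])
toPositional = go []
  where
    go :: [String] -> String -> (String, [String])
    go names [] = ([], names)
    -- Leave "::" (e.g. a PostgreSQL cast) untouched.
    go names (':' : ':' : rest) = emit "::" names rest
    go names (':' : rest)
      | (name@(_ : _), rest') <- span isIdentChar rest =
          case elemIndex name names of
            Just i  -> emit ('$' : show (i + 1)) names rest'
            Nothing -> emit ('$' : show (length names + 1)) (names ++ [name]) rest'
    go names (c : rest) = emit [c] names rest

    -- Output a chunk, then continue with the remaining input.
    emit :: String -> [String] -> String -> (String, [String])
    emit out names rest =
      let (sql, final) = go names rest
      in (out ++ sql, final)

    isIdentChar c = isAlphaNum c || c == '_'

-- | Order the caller's named values to match the positional parameters;
-- Nothing if the SQL mentions a name the caller did not supply.
bindParams :: [String] -> [(String, v)] -> Maybe [v]
bindParams names values = traverse (`lookup` values) names
```

With this, "a = :x AND b = :y OR c = :x" turns into "a = $1 AND b = $2 OR c = $1" plus the name list ["x", "y"], so the same [(name, value)] list can feed either variant of the query.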

Anyway, this is an issue I've sometimes run across with JDBC (which uses
?) in particular and it can be very annoying. Perhaps the best thing in
Haskell would be to just avoid raw SQL entirely in favor of combinators,
but then you often end up with suboptimal SQL which can't really exploit
all the features of your chosen database. Even so it would be nice to
have a DB interface/library that can hit that sweet spot where you can
write your own SQL but your program won't be too tied to a single DB
backend (modulo the concrete SQL).

 
 Regarding migrations: If you haven't already, please have a look at
 Liquibase (http://www.liquibase.org/documentation/index.html) before
 attempting to implement migrations. The most important attributes of
 Liquibase are:

 
 What I am trying to implement is not a new migration system, but just the
 common interface for
 simple schema actions, here is my in-mind draft:
 
 newtype TableName = TableName Text
 
[--snip--]
 These typeclasses must provide database-independent schema introspection and
 modification.
 Migration system can be anything you want.
 

Ah, OK, I see I just misinterpreted the bit in the package description
about migrations then :).

You might end up having a little trouble reconciling metadata from the
different database backends, but certainly there must be *some* useful
common subset of table/index/etc. metadata :).

 I also have the idea not to throw exceptions in IO but to return (Either
 SqlError a) from
 all the Connection and Statement methods for safe data processing. What do
 you think about that?
 

I don't think I'm qualified to have an opinion either way, but perhaps


http://www.randomhacks.net/articles/2007/03/10/haskell-8-ways-to-report-errors

and particularly

   http://hackage.haskell.org/package/errors

can serve as inspiration :).
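
For what it's worth, the two styles are easy to bridge with try from Control.Exception, so offering both is cheap. A minimal sketch, assuming a stand-in SqlError type and a fake runQuery (neither is HDBI's actual API):

```haskell
import Control.Exception (Exception, throwIO, try)

-- Hypothetical error type; the real SqlError in hdbi may look different.
newtype SqlError = SqlError String
  deriving (Eq, Show)

instance Exception SqlError

-- Stand-in for a driver call that reports failure by throwing.
runQuery :: String -> IO Int
runQuery sql
  | null sql  = throwIO (SqlError "empty query")
  | otherwise = pure (length sql)

-- The same call with the failure made visible in the type.
runQueryE :: String -> IO (Either SqlError Int)
runQueryE = try . runQuery
```

Note that try only catches the exception type named in the result, so anything that is not a SqlError still propagates as an exception.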

Regards,




Re: [Haskell-cafe] ANNOUNCE: hdbi-1.0.0 and hdbi-postgresql-1.0.0

2013-07-30 Thread Bardur Arantsson

On 2013-07-31 05:45, Alexey Uimanov wrote:
 Hello, haskellers. This is the first release of HDBI (Haskell Database
 Independent interface). It is a fork of HDBC.
 HDBI has some improvements in design, it also has better testing and
 performance (mainly because of using Text instead of String everywhere).
 HDBI is designed to be a more flexible and correct database interface.
 
 You can find out more information in the documentation.
 http://hackage.haskell.org/package/hdbi
 http://hackage.haskell.org/package/hdbi-postgresql
 
 I am planning to implement MySql and Sqlite drivers as well
 
 https://github.com/s9gf4ult/hdbi
 https://github.com/s9gf4ult/hdbi-postgresql
 Issues and pull requests are welcome, as well as collaboration.
 

Looks interesting.

Just a couple of comments from skimming the documentation. Apologies, if
these are already addressed -- didn't see any mention of it from my
read-through.

Regarding parameterized SQL: It might be worth using named parameters (e.g.
:foo and :bar or something like that) rather than ? as
placeholders in SQL/prepared SQL. This will make it slightly more
flexible if you need to provide different SQL strings for different
databases, but want to reuse the code which does the actual running of
the SQL. It's also more flexible if you need to repeat parameters -- the
latter is typical with PostgreSQL if you want to emulate
update-or-insert in a single SQL statement.

Regarding migrations: If you haven't already, please have a look at
Liquibase (http://www.liquibase.org/documentation/index.html) before
attempting to implement migrations. The most important attributes of
Liquibase are:

  a) migrations are *manually* specified deltas as opposed to deltas
 automatically derived from two snapshots of the database. Anything
 based on automatically getting from snapshot A to snapshot B *will*
 break or do undesirable/unpredictable things to data at some point.
  b) It deliberately does NOT provide migrations in a Turing Complete
 language -- IME, if you need TC migrations on a routine basis
 you're already in deep trouble and need to think more about what
 you're doing (at a process or engineering level). If you
 *really* need to, you *can* extend it to do custom migrations via
 code, but the barrier to entry is sufficiently high that you'll
 rarely be tempted unless you *really* need it and the migration
 step is likely to be reusable.
  c) the concept of contexts (see the documentation) which allows
 you to vary the migrations across different environments as
 needed. (This is definitely open to abuse, but when you need it
 you *really* need it, IME.)
  d) It can also generate full SQL for changes-about-to-be-applied
 so that you can audit and/or apply them manually -- some devops
 teams need this, others may not care. A minor but important
 detail is that the generated SQL includes all the metadata for
 tracking the application of the migrations themselves.

Liquibase is the only system *I've* ever seen that is even close to
doing migrations the Right Way According To Me(TM). (I've used it since
the 1.x days using the XML format and still haven't come across anything
that can really compete.) Liquibase certainly has flaws(*), but one
should think *really* hard about whether there isn't some existing
migrations system which is good enough before foisting yet another
migrations system on the world since it's highly likely to be at least
as flawed as all the existing systems in one way or another.

(Of course, being implemented in Java, there is one aspect of Liquibase
which is annoying if using it for Haskell projects: It's not very
convenient to auto-apply all migrations at startup time without some
sort of run-this-executable-script-before-running-the-main-program
hack. That, and you need a Java Runtime Environment installed.)

Anyway, just throwing that out there...

Regards,

/b

(*) Sub-par internal consistency checking across combinations of
contexts for changelog files being one of them. There's also the fact that
it was until recently XML-based and that the new syntaxes all have
their own problems. Maybe a Haskell DSL could lead to some real
improvements here...




Re: [Haskell-cafe] Proposal: Non-recursive let

2013-07-23 Thread Bardur Arantsson

On 2013-07-22 17:09, i c wrote:
 Usage of shadowing is generally bad practice. It is error-prone. Hides
 obnoxious bugs like file descriptors leaks.

These claims need to be substantiated, I think.

(Not that I disagree, I just think that asserting this without evidence
isn't going to convince anyone who is of the opposite mindset.)

Regards,




Re: [Haskell-cafe] Proposal: Non-recursive let

2013-07-23 Thread Bardur Arantsson

On 2013-07-23 21:37, i c wrote:
 let's consider the following:
 
 let fd = Unix.open ...
 let fd = Unix.open ...
 
 At this point one file descriptor cannot be closed. Static analysis will
 have trouble catching these bugs, so do humans.
 Disallowing variable shadowing prevents this.
 The two fd occur in different contexts and should have different names.
 

I think you've misunderstood my challenge.

I'm not talking about examples of either good or bad, but empirical
*evidence* for sample sizes greater than 1.

As in: If there was an article titled "Is shadowing easier to understand
than explicitly named intermediate variables?" with an empirically
supported conclusion, I think everybody would be happy, but I just don't
think we're quite there...

Regards,





Re: [Haskell-cafe] tangential request...

2013-06-24 Thread Bardur Arantsson

On 06/24/2013 06:18 AM, Mark Lentczner wrote:
 Thanks all, I’ve got what I needed.
 
 Finally, 15% seem to be using horrid bitmap console fonts. _How can you
 stand to look at them?!?!_ (Don't worry, you'll have Plush soon enough...)
 

I realize this is probably a bit tongue-in-cheek, but for my money
nothing has beaten Terminus to date. In fact, few fonts even come close
and most of them are bitmapped too.

(It kind of sucks that it only has a few fixed sizes, but until we all
get 600 dpi displays with paper-like contrast, it's probably staying as
my font of choice. That, or radically improved hinting and
antialiasing.)

Regards,




Re: [Haskell-cafe] I wish cabal-dev to travel back in time

2013-05-11 Thread Bardur Arantsson

On 05/11/2013 11:10 AM, Alberto G. Corona wrote:
 Hi Café:
 
 
 
 I created just now an issue in cabal-dev:
 
 
 
 https://github.com/creswick/cabal-dev/issues/101
 
 
 
 When compiling old developments, I wish cabal-dev to install and build
 dependencies that were available at that time
 
 I don't know if there are alternatives to solving this issue. I think that
 this is very useful and necessary, especially now when library updates are
 increasingly frequent.
 
 Maybe this problem is already solved and I just don't know (as is often the
 case, for example when, in a sudden aha moment, I reinvented cabal-install
 months after the release). That is the reason why I tell you about it here
 in order to discuss it.
 

This is an issue that the JVM ecosystem is struggling with as well. I'd
encourage people who are thinking about implementing something to have a
look at

   https://github.com/sbt/adept/wiki/NEScala-Proposal

and especially the linked video.




Re: [Haskell-cafe] Backward compatibility

2013-05-03 Thread Bardur Arantsson

On 05/03/2013 06:44 PM, Niklas Hambüchen wrote:
 While I certainly enjoy the discussion, how about addressing one of the
 original problems:
 
 On 02/05/13 13:27, Adrian May wrote:
 I just tried to use Flippi. It broke because of the syntax change so I
 tried WASH. I couldn't even install it because of the syntax change.
 
 I just fixed that in https://github.com/nh2/flippi (note that I have
 never seen this code before nor even known about it).
 
 https://github.com/nh2/flippi/commit/5e2fa93f82b4123d0d5b486209c3b722c4c1313d
 
 Had to delete 5 imports and convert one time value.
 
 Took me around 3 minutes.
 

+1




Re: [Haskell-cafe] Backward compatibility

2013-05-03 Thread Bardur Arantsson

On 05/03/2013 06:49 PM, Edward Kmett wrote:
 Tantamount to a new language to fix a minor detail in a typeclass
 hierarchy? That is just histrionic. *No* language is that stable.
 
 Scala makes dozens of changes like that between *minor* versions, and while
 I hardly hold up their development practices as the best in the industry it
 is still somehow seen as enterprise ready.
 
 -Edward

Indeed. It's all turned into absurd hyperbole at this point.

Regards,




Re: [Haskell-cafe] How far compilers are allowed to go with optimizations?

2013-02-09 Thread Bardur Arantsson

On 02/09/2013 09:56 AM, Johan Holmquist wrote:

[--snip--]
 It just so happened that the old code triggered some aggressive
 optimization unbeknownst to everyone, **including the original
 developer**, while the new code did not. (This optimization maybe even
 was triggered only on a certain version of the compiler, the one that
 happened to be in use at the time.)
 
 I fear being P some day.
 
 Maybe this is something that would never happen in practice, but how
 to be sure...
 

It's definitely a valid point, but isn't that an argument *for* testing
for performance regressions rather than *against* compiler optimizations?

Actually, it'd be really nice if you could have statically verifiable
big-O performance assertions in the code. I'm guessing that a lot of
work will have been done in this area. Anyone have any pointers to such
work?

Regards,




Re: [Haskell-cafe] How far compilers are allowed to go with optimizations?

2013-02-09 Thread Bardur Arantsson

On 02/09/2013 10:50 AM, Johan Holmquist wrote:
 I guess I fall more to the "reason about code" side of the scale
 rather than the "testing the code" side. Testing seems to induce false
 hopes about finding all defects even to the point where the tester is
 blamed for not finding a bug rather than the developer for introducing
 it.

Oh, I'm definitely also on that side, but you have to do the best you
can with the tools you have :).

 
 [Bardur]
 It's definitely a valid point, but isn't that an argument *for* testing
 for performance regressions rather than *against* compiler optimizations?
 
 We could test for regressions and pass. Then upgrade to a new version
 of compiler and test would no longer pass. And vice versa.
 Maybe that's your point too. :)
 

Indeed :).

 [Iustin]
 Surely there will be a canary
 period, parallel running of the old and new system, etc.?
 
 Is that common? I have not seen it and I do think my workplace is a
 rather typical one.

I don't know about common, but I've seen it done a few times. However,
it's mostly been in situations where major subsystems have been
rewritten and you _really_ want to make sure things still work as they
should in production.

Sometimes you can get away with just making the new-and-shiny code path
a configure-time option and keeping the old-and-beaten code path. (Tends
to be messy code-wise until you can excise the old code path, but
what're you gonna do?)

 
 Also, would we really want to preserve the old bad code just because
 it happened to trigger some optimization?

These things depend a lot on the situation at hand -- if it's something
99% of your users will hit, then yes, probably... until you can figure
out why the new-and-shiny code *doesn't* get optimized appropriately.

 
 Don't get me wrong, I am all for compiler optimizations and the
 benefits they bring as well as testing. Just highlighting some
 potential downsides.
 

It's all tradeoffs :).

Regards,




Re: [Haskell-cafe] Ticking time bomb

2013-01-31 Thread Bardur Arantsson

On 01/30/2013 08:27 PM, Edward Z. Yang wrote:
 https://status.heroku.com/incidents/489
 
 Unsigned Hackage packages are a ticking time bomb.
 

Somewhere else that shall not be mentioned, someone posted this link
which points to an interesting solution to this problem:

   http://www.futurealoof.com/posts/nodemodules-in-git.html

It requires a little basic knowledge of the Node Package Manager to
understand. Here's a little summary that should make it easier to understand
for people who are not familiar with NodeJS:

The Node Package Manager (npm) is the Node JS equivalent of
cabal-install(*).

When you install a module (think Haskell package off Hackage) using
npm, it installs into a directory called node_modules in the
project's directory instead of installing into a global name space.

When a NodeJS program imports a required module, it is first looked up
in the node_modules directory _before_ looking in the global package
database.

Since modules *are* their source, you can check all of this into the
revision control system of your choice.

It seems to me that cabal install could do something very similar to
solve many of the cabal hell and potential security issues when users
blindly do cabal install.

(*) Yeah, yeah, not a package manager. In practice it's being used as
one, so...




Re: [Haskell-cafe] Can cabal be turned into a package manager?

2012-12-12 Thread Bardur Arantsson

On 12/12/2012 06:01 PM, Janek S. wrote:
 In recent months there was a lot of discussion about cabal, dependency
 hell and the like. After reading some of these discussions there is a
 question I just have to ask:
 
 Why not create a package manager (like rpm or apt) for Haskell software?
 
 I've been using Linux for years. Software for Linux is mostly written in C
 and C++. There are thousands of libraries with lots of dependencies and yet:
 a) Linux distributions manage to have package repositories that are kept in
 a consistent state
 b) Linux package managers can avoid dependency hell, automatically update to
 new packages, etc.
 Linux people did it! Is there any technical issue that prevents Haskell
 people from doing exactly the same thing? Or are we just having non-technical
 problems like lack of money or developers?
 

Well, one big issue is that Linux distribution packagers have control of
the entire stack. A (hypothetical) Haskell package manager wouldn't.

Typical package managers also restrict you to exactly one version of any
given package. This can be a severe limitation for developers.

(There are more issues.)




Re: [Haskell-cafe] Fwd: education or experience?

2012-12-09 Thread Bardur Arantsson

On 12/10/2012 01:20 AM, Eli Frey wrote:
 
 Jerzy makes a good point that you might not be the best judge of what you
 should learn.

Not only that: you have *no reliable way of knowing* what you might be
missing.

Any half-decent CS education gives you a very broad grounding in the
field so that you'll know where to look and what you need to read up on
when you find yourself stuck trying to tackle some problem. Without the
grounding there's a real risk that you might end up fighting windmills
or reinventing solutions that were already known in the 1970s.

Note: I am not saying that *formal* education is necessarily the only
way to get such a grounding, but it *is* a very reliable one assuming
that, a) you find a half-decent university, and b) it suits your
temperament, and c) you put in the requisite effort to learn about things
that may not be of immediate interest.

Regards,




Re: [Haskell-cafe] How can I avoid buffered reads?

2012-11-28 Thread Bardur Arantsson

On 11/28/2012 08:38 PM, Leon Smith wrote:
 I have some code that reads (infrequently) small amounts of data from
 /dev/urandom,  and because this is pretty infrequent,  I simply open the
 handle and close it every time I need some random bytes.
 
 The problem is that I recently discovered that,  thanks to buffering within
 GHC,   I was actually reading 8096 bytes when I only need 16 bytes,  and
 thus wasting entropy.   Moreover  calling hSetBuffering  handle NoBuffering
 did not change this behavior.
 
 I'm not sure if this behavior is a bug or a feature,  but in any case it's
 unacceptable for dealing with /dev/urandom.   Probably the simplest way to
 fix this is to write a little C helper function that will read from
 /dev/urandom for me,  so that I have precise control over the system calls
 involved. But I'm curious if GHC can manage this use case correctly;
 I've just started digging into the GHC.IO code myself.
 

Use openFd, fdReadBuf and closeFd from the System.Posix.IO.ByteString
module in the 'unix' package.

Those correspond directly to system calls and are thus unbuffered.
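
Something along these lines should do it. Treat it as a sketch under assumptions: readUrandom is a made-up helper name, and the openFd call is written for unix >= 2.8, where it takes three arguments. Older versions of the package take an extra Maybe FileMode argument (pass Nothing) between the mode and the flags.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Exception (bracket)
import qualified Data.ByteString as B
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (castPtr)
import System.Posix.IO.ByteString
  (OpenMode (ReadOnly), closeFd, defaultFileFlags, fdReadBuf, openFd)

-- | Read up to n bytes from /dev/urandom with a single read() call,
-- bypassing GHC's Handle buffering entirely.
readUrandom :: Int -> IO B.ByteString
readUrandom n =
  -- unix >= 2.8 signature; older versions need: openFd path ReadOnly Nothing flags
  bracket (openFd "/dev/urandom" ReadOnly defaultFileFlags) closeFd $ \fd ->
    allocaBytes n $ \buf -> do
      got <- fdReadBuf fd buf (fromIntegral n)
      -- Copy exactly the bytes that were read into a ByteString.
      B.packCStringLen (castPtr buf, fromIntegral got)
```

A single read() may in principle return fewer bytes than requested, so check the length of the result (or loop) if you need exactly n bytes.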



Re: [Haskell-cafe] One of the new buzz phrases is Event-Sourcing; is Haskell suitable for this?

2012-09-30 Thread Bardur Arantsson

On 09/30/2012 02:46 AM, KC wrote:
 http://martinfowler.com/eaaDev/EventSourcing.html
 
 http://martinfowler.com/articles/lmax.html
 
 

Sure, why not? See

http://hackage.haskell.org/package/cqrs-0.8.0
and http://hackage.haskell.org/package/cqrs-example-0.8.0

for an example application.

I should note that the cqrs package API is by no means finalized;
there are some limitations(*) to the current implementation, but I've
not had time to actually get rid of those limitations.

(*) The major ones being the requirement for a global version number and
lack of streaming event sourcing.





Re: [Haskell-cafe] What is the surefire way to handle all exceptions and make sure the program doesn't fail?

2012-07-17 Thread Bardur Arantsson

On 07/17/2012 08:34 AM, Yifan Yu wrote:
 First of all, apologies if the question is too broad. The background goes
 like this: I've implemented a server program in Haskell for my company
 intended to replace the previous one written in C which crashes a lot (and
 btw the technology of the company is exclusively C-based).  When I chose
 Haskell I promised my manager (arrogantly - I actually made a bet with
 him), it won't crash. Now it has been finished (with just a few hundred
 LOC), and my test shows that it is indeed very stable. But by looking at
 the code again I'm a little worried, since I'm rather new to exception
 handling and there're many networking-related functions in the program. I
 was tempted to catch (SomeException e) at the very top-level of the program
 and try to recursively call main to restart the server in case of any
 exception being thrown, but I highly doubt that is the correct and
 idiomatic way. There are also a number of long-running threads launched
 from the main thread, and exceptions thrown from these threads can't be
 caught by the top-level `catch' in the main thread.
 My main function looks
 like this:
 
[--snip--]

 I find that I can't tell whether a function will throw any exception at
 all, or what exceptions will be thrown, by looking at their documentation.
 I can only tell if I browse the source code. So the question is, how can I
 determine all the exceptions that can be thrown by a given function?

Look at its source.

 And
 what is the best way to handle situations like this, with both the
 long-running threads and main thread need to be restarted whenever
 exceptions happen.
 

The most robust way is probably to use a completely independent
supervisor program, e.g. upstart, systemd, runit, etc. These
usually have facilities for restarting the supervised program, and a
rate limit on exactly how often to try that (over a given period of time).

These *won't* work for a program that's deadlocked because an important
thread has died. For that you'll need either a watchdog (external) or an
in-program mechanism for supervised threads which can catch any and
all exceptions and restart threads as necessary. This tends to be very
domain-specific, but you might take some inspiration from the way
supervisor hierarchies work in the actor model.
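
As a very rough illustration of the in-program variant, here is a sketch of a restart loop. The names supervise and demo, the restart limit and the fixed 100 ms back-off are all assumptions made for the example, not an established library API.

```haskell
import Control.Concurrent (threadDelay)
import Control.Exception (SomeException, throwIO, try)
import Control.Monad (when)
import Data.IORef (atomicModifyIORef', newIORef, readIORef)

-- | Run a worker; if it dies with any exception, restart it, giving up
-- after maxRestarts restarts. Returns when the worker finishes normally
-- or the limit is reached.
supervise :: Int -> IO () -> IO ()
supervise maxRestarts worker = go 0
  where
    go :: Int -> IO ()
    go n = do
      r <- try worker :: IO (Either SomeException ())
      case r of
        Right () -> pure ()
        Left e
          | n >= maxRestarts -> putStrLn ("giving up: " ++ show e)
          | otherwise -> do
              putStrLn ("worker died: " ++ show e ++ "; restarting")
              threadDelay 100000 -- crude rate limit: 100 ms between restarts
              go (n + 1)

-- Usage sketch: a worker that fails on its first two attempts and
-- succeeds on the third; returns the number of attempts made.
demo :: IO Int
demo = do
  attempts <- newIORef (0 :: Int)
  supervise 5 $ do
    n <- atomicModifyIORef' attempts (\k -> (k + 1, k + 1))
    when (n < 3) $ throwIO (userError "boom")
  readIORef attempts
```

In a real server each long-running thread would be started as forkIO (supervise limit worker). Be aware that catching SomeException also swallows asynchronous exceptions such as ThreadKilled, so a production version needs to decide which exceptions should really trigger a restart.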

Regards,



Re: [Haskell-cafe] What is the surefire way to handle all exceptions and make sure the program doesn't fail?

2012-07-17 Thread Bardur Arantsson

On 07/17/2012 10:17 PM, Christopher Done wrote:
 On 17 July 2012 22:10, Bardur Arantsson s...@scientician.net wrote:
 On 07/17/2012 08:34 AM, Yifan Yu wrote:
 I can only tell if I browse the source code. So the question is, how can I
 determine all the exceptions that can be thrown by a given function?

 Look at its source.
 
 Not sure that's the most productive comment. ;-P
 

Well, it's either that or the documentation, and if you want to be
*really* sure...

(I did realize that the OP did mention looking at the source, I just
thought I'd confirm. I hope it didn't come out snarky -- I certainly
didn't intend it to.)

Regards,



Re: [Haskell-cafe] haskell.org is so fragile

2012-07-12 Thread Bardur Arantsson

On 07/12/2012 11:04 PM, Ganesh Sittampalam wrote:
 Hi,
 
 On 12/07/2012 13:06, Takayuki Muranushi wrote:
 Today I have observed that hackage.haskell.org/ timeout twice (in the
 noon and in the evening.)

 What is the problem? Is it that haskell users have increased so that
 haskell.org is overloaded? That's very good news.
 I am eager to donate some money if it requires server reinforcement.
 
 The issue is unfortunately more to do with sysadmin resources than
 server hardware; there's no one with the time to actively manage the
 server and make sure that it's running well. Any ideas for improving the
 situation would be gratefully received.
 
 Today there were some problems with some processes taking up a lot of
 resources. One of the problems was the hackage search script, which has
 been disabled for now.
 

Since I don't have insight into the inner sanctum (aka.
service/server setup) this may be way off base, but how about adding a
Varnish instance in front of haskell.org and its various subdomains? It
could cache everything for 5 minutes (unconditional) and reduce load by
ridiculous amounts.

Regards,



Re: [Haskell-cafe] ghci and TH cannot: unknown symbol `stat64`

2012-07-11 Thread Bardur Arantsson

On 07/11/2012 05:12 PM, Michael Snoyman wrote:
 
 Thanks for the feedback. However, looking at sqlite3.c, I see the
 necessary #include statements:
 
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <unistd.h>
 
 I'm confident that none of my code is making calls to stat/stat64 via
 the FFI. In case it makes a difference, this problem also disappears
 if I compile the library against the system copy of sqlite3 instead of
 using the C source.

You may need some extra defines, see the comments in man stat64.

Regards,



Re: [Haskell-cafe] Most C++ compilers will not optimize x^2.0 as x*x but instead will do an expensive ...

2012-05-23 Thread Bardur Arantsson

On 05/24/2012 04:13 AM, Brandon Allbery wrote:
 On Wed, May 23, 2012 at 9:47 PM, KC kc1...@gmail.com wrote:
 
 exponentiation and logarithm.
 So, I believe this C++ versus Haskell versus (your language of choice) is
 a Penn & Teller misdirection.
 Whereas, another level of indirection solves everything.

 
 Is it me or is this style of message — content broken between subject and
 body, no reference information tying it to the presumed topic (or possibly
 a /non sequitur/) — better suited to Twitter than a mailing list?
 
 

This has come up before -- this KC person probably has a broken mail
client which doesn't set appropriate References headers.

@KC: Which mail client are you using?

... and could you please 1) (ideally) use a mail client which doesn't
screw up threading, or 2) (less ideally) avoid messing with the subject
line so that at least everybody else's mail client has that to go on for
threading purposes?

Regards,



Re: [Haskell-cafe] Can Haskell outperform C++?

2012-05-16 Thread Bardur Arantsson

On 05/16/2012 09:02 PM, Gregg Lebovitz wrote:
 Isaac,
 
 I was looking at the debian coding contest benchmarks referenced by
 others in this discussion. All of the benchmarks algorithms, appear to
 be short computationally intensive programs with a fairly low level of
 abstraction.
 
 In almost all examples, the requirement says: you must implement the X
 functions as implemented in Java, or C#, or C++. The k-nucleotide even
 specifies a requirement to use an update a hash-table.
 
 I wonder if someone here could come up with a set of benchmarks that
 would outperform C++.
 

That's easy:

 > let ones = 1 : ones
 > take 5 ones
 [1,1,1,1,1]

I'm not sure how much C++ code you'd have to write to produce the
correct answer without butchering the intent too much, but the naïve
translation to C++ loops infinitely. Obviously Haskell is infinitely
better than C++!1oneone!
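
For anyone who wants to try this outside of GHCi, here is a minimal sketch as a standalone program. The name `nats` is just an extra illustration and not from the session above. It only terminates because `take` demands a finite prefix of each list; printing `ones` itself would never finish.

```haskell
-- An infinite list defined in terms of itself; laziness means only the
-- cells that are actually demanded get built.
ones :: [Integer]
ones = 1 : ones

-- Illustrative extra: the natural numbers, built the same lazy way.
nats :: [Integer]
nats = iterate (+ 1) 0

main :: IO ()
main = do
  print (take 5 ones)
  print (take 5 nats)
```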

 Interesting that you would focus on this one comment in my post and not
 comment on one on countering the benchmarks with a new criteria for
 comparing languages.

Comparing languages is a highly non-trivial matter involving various
disciplines (including various squidgy ones) and rarely makes sense
without a very specific context for comparison.

So the short answer is: mu. Discovering the long answer requires a
lifetime or more of research and may not actually result in an answer.

Regards,



Re: [Haskell-cafe] Learn you

2012-05-02 Thread Bardur Arantsson


On 05/02/2012 07:37 PM, Brandon Allbery wrote:

On 2 May 2012 18:18, Brent Yorgey <byor...@seas.upenn.edu> wrote:


I am curious how the title was translated.  Of course, the English
title Learn You a Haskell for Great Good uses intentionally
ungrammatical/unidiomatic English for humorous effect.  Is the



On Wed, May 2, 2012 at 1:25 PM, Colin Adams <colinpaulad...@gmail.com> wrote:


I don't find it (the English title) humorous. I just assumed it was
written by a non-native English speaker.



The English title does require a little context for the humor:  it
leverages a chain of poor-translation memes going back (at least) to
all-your-base.



I always thought it was a nod to

   Borat: Cultural Learnings of America for Make Benefit Glorious 
Nation of Kazakhstan


Regards,



Re: [Haskell-cafe] Summer of Code idea: Haskell Web Toolkit

2012-03-06 Thread Bardur Arantsson


On 03/06/2012 11:38 PM, Christopher Done wrote:

I might as well chime in on this thread as it is relevant to my
interests. I made a write up on a comparison of HJScript (JavaScript
EDSL) and my Ji (control browser from Haskell) library:
https://github.com/chrisdone/ji

HJScript is OK, hpaste.org uses it here:
https://github.com/chrisdone/amelie/blob/master/src/Amelie/View/Script.hs
output here: http://hpaste.org/js/amelie.js



HJScript (0.5.0) generates invalid Javascript if you try to use 
anonymous functions.


(Digs through email archives... Ah, yes:)

 snip 
Given

 testJS :: HJScript ()
 testJS = do
   f <- function (\(e :: JInt) -> do
     x <- inVar true
     return $ x)
   callProc f (int 3)
   return ()

 main :: IO ()
 main = do
   putStrLn $ "JS: " ++ (show $ evalHJScript $ testJS)

We get the output

 function (param0_0){var var_1 = true;return var_1;}(3);

But this is invalid syntax in JavaScript, and should really be

 (function (param0_0){var var_1 = true;return var_1;})(3);

... which works.
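
The reason for the difference: a statement that begins with the `function` keyword is parsed as a function declaration, which needs a name, so the anonymous function has to be wrapped in parentheses to be treated as an expression. Here is a rough sketch of a renderer that always emits the parenthesized form. Note that `iife` is a hypothetical helper written for this illustration; it is not part of HJScript.

```haskell
import Data.List (intercalate)

-- Hypothetical helper (NOT an HJScript function): render an immediately
-- invoked function expression. The function is wrapped in parentheses so
-- the statement does not start with the `function` keyword.
iife :: [String] -> String -> [String] -> String
iife params body args =
  "(function (" ++ intercalate "," params ++ "){" ++ body ++ "})("
    ++ intercalate "," args ++ ");"

main :: IO ()
main = putStrLn (iife ["param0_0"] "var var_1 = true;return var_1;" ["3"])
```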

 snip 

Just something to be aware of.

(For my particular usage it was also too strictly typed, but that's 
another matter.)


Regards,



Re: [Haskell-cafe] Reasons for Super-Linear Speedup

2012-03-05 Thread Bardur Arantsson


On 03/05/2012 04:58 PM, burak ekici wrote:

Dear List;

I have parallelized RSA decryption and encryption schemes
by using second generation strategies of Haskell p.l.

In that case, what I got in the sense of speed up was nearly
10 times of better performances (on a quad-core CPU with 8M cache)
in the parallel evaluation of 125K long plain text with 180-bit
of an encryption key, in comparison with the serial evaluation,
abnormally.



The explanation for this kind of thing is usually that all the working 
data suddenly fits within the per-CPU L2 cache when split up.


Regards,



Re: [Haskell-cafe] [haskell-cafe] Some reflections on Haskell

2012-02-14 Thread Bardur Arantsson


On 02/14/2012 04:13 PM, Doug McIlroy wrote:

Nevertheless, I share Jardine's concern about the central problem.
It is hard to find one's way in this ecosystem.  It needn't be,
as Java illustrates.



As a professional Java developer this sounds really strange, but maybe I 
just haven't found it yet.


Do you mean to suggest that it's easy to choose between e.g. Spring, 
Guice, etc., the umpteen OSGi containers, the several logging 
frameworks, etc. etc. etc.?




To my mind Java's great contribution to the world is its library index--light
years ahead of typical
documentation one finds at haskell.org, which lacks the guiding
hand of a flesh-and-blood librarian.  In this matter, it
seems, industrial curation can achieve clarity more easily than
open source.


Which library index? If you're talking about API documentation, then I 
must confess that I much prefer the Hackage docs.


I'm *really* confused by your post and really can't recognize the Java 
you're talking about.


Can you give us pointers to concrete examples?



Re: [Haskell-cafe] Subject: A universal data store interface

2012-02-13 Thread Bardur Arantsson


On 02/13/2012 09:36 PM, Michael Snoyman wrote:

You make it sound like your options are use the crippled abstraction
layer or use the full-powered database layer. You're leaving out
two very important points:

1. There's a reason the abstraction layer exists: it can be clumsy to
go directly to the full-powered database for simple stuff.


That's simply a reason to make access to the *full-powered database* 
easier, not a reason to make access to *every database* identical. Doing 
that is a mistake *unless* you're going to avoid SQL entirely but 
somehow still retain the full database power. For example, SQLite 
requires entirely different SQL contortions to get certain types of 
fields in query results from the way PostgreSQL does it. That means 
you'll have to change your program a lot even if you use e.g. HDBC for 
database access.


My experience is roughly similar to Paul R's. You often give up too much 
by going with generic ORM and such.


That's not to say you can't make working with each particular DB much 
more pleasant than it is currently -- postgresql-libpq, for example, is
almost useless as an application-level API, and I'm working on (no 
guarantees!) a little postgresql-libpq-conduit thingy which will 
hopefully make issuing queries and iterating over results a much more 
pleasant experience without burdening you with all kinds of ridiculously
low-level detail, and at the same time will NOT shield you from the 
low-level detail that actually *matters*.


The Database Supported Haskell stuff 
(http://hackage.haskell.org/package/DSH) also seems relevant to this 
discussion, since this does seem like it could actually leverage the 
immense power of (some) databases without having to bother too much with 
low-level DB access.



2. You can bypass the abstraction layer whenever you want.

I like to describe Persistent's goal as doing 95% of what you need,
and getting out of your way for the other 5%. You can write raw SQL
queries with Persistent. I use this for implementing full-text search.
I haven't needed to write deep analytical tools recently, but if I
did, I would use raw SQL for that too.


Yes, but then you end up being fully tied to the database *anyway*, so 
why not just make *that* easier and safer from the start?


(I realize that this is a hard problem in practice. It's certainly NOT 
small enough for a GSoC, IMO.)




Persistent's advantage over going directly to the database is concise,
type-safe code. Are you really telling me that
`runSql "SELECT * FROM foo where id=?" [someId]` plus a bunch of marshal
code is better than `get someId`?



For starters you should probably never do a SELECT * (which is what 
one assumes Persistent would/will do) -- on an SQL database the 
performance characteristics and locking behavior may change dramatically 
over time... while on a $generic-NoSQL database there may not really be any
other option, and the performance characteristics *won't* necessarily 
change too dramatically. This is an example of why introducing something 
like Persistent (or ORMs in general) may be a non-trivial decision.


Besides, you probably won't need a lot of marshalling code if you know 
what the query result field types are going to be (you should!). You 
just pattern-match, e.g.


  processQueryResult [SqlInteger i, SqlByte j, SqlText x] =
    ... -- do whatever to i, j and x
  processQueryResult _ = error "Invalid columns in query result"

Yes, this means you'll need to know exactly how the table was created 
(but not in the case of SQLite -- there you MAY have to add various 
casts to the SQL or to manually convert from SqlText to your intended 
Haskell datatype).
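
To make the pattern-match idea concrete, here is a self-contained sketch. The `SqlValue` type below is a hypothetical stand-in defined only for this example, with constructor names copied from the snippet above rather than from any real database library. It returns an `Either` instead of calling `error`, which is a design choice of the sketch, so that the caller decides what to do with a malformed row.

```haskell
-- Hypothetical stand-in for a database library's column value type;
-- the constructor names follow the snippet above, not a real API.
data SqlValue
  = SqlInteger Integer
  | SqlByte Int
  | SqlText String
  deriving (Show, Eq)

-- Accept exactly the row shape the query is known to produce and
-- report anything else as an error value.
processQueryResult :: [SqlValue] -> Either String (Integer, Int, String)
processQueryResult [SqlInteger i, SqlByte j, SqlText x] = Right (i, j, x)
processQueryResult row =
  Left ("Invalid columns in query result: " ++ show row)

main :: IO ()
main = do
  print (processQueryResult [SqlInteger 1, SqlByte 2, SqlText "foo"])
  print (processQueryResult [SqlText "oops"])
```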


I don't think anyone denies that having a compile-time guarantee of a
successful match would be a good thing.


It's just that this area is far more complicated than people give it
credit for.





Re: [Haskell-cafe] Need advice: Haskell in Web Client

2012-01-26 Thread Bardur Arantsson


On 01/26/2012 11:16 AM, dokondr wrote:

On Thu, Jan 19, 2012 at 1:37 AM, Dag Odenhall <dag.odenh...@gmail.com> wrote:



On Tue, 2012-01-17 at 22:05 +0300, dokondr wrote:


I prefer using Turing complete PL to program web client, like the one used
in GWT (Java) or Cappuccino (Objective-J). http://cappuccino.org/learn/
In this case you /almost/ don't need to know HTML, CSS, DOM, Ajax, etc. to
develop WebUI and good PL lets you concentrate on problem domain instead of
bothering about browser support.
It is a real pity that Haskell still has no such tools to generate Web GUI
in Javascript. (((


Have you seen Chris Done's posts on the subject?

http://chrisdone.com/tags/javascript.html



Thanks for the link! (Never seen this before)
Ideally, I would be happy to be able to write in Haskell a complete
front-end / GUI, so it could be compiled to different back-ends: Javascript
to run in the Browser and also a standalone app.
In Python world this is already done with Pyjamas (http://pyjs.org/) - a
Rich Internet Application (RIA) Development Platform for both Web and
Desktop.
Also from Pyjamas site:
Pyjamas ... contains a Python-to-Javascript compiler, an AJAX framework
and a Widget Set API.
Pyjamas Desktop is the Desktop version of Pyjamas
Pyjamas Desktop allows the exact same python web application source code to
be executed as a standalone desktop application (running under Python)
instead of being stuck in a Web browser.

Architecture diagram
http://pyjs.org/wiki/pyjamasandpyjamasdesktop/

I wonder if somebody works on similar Haskell Rich Internet Application
(RIA) Development Platform ?
Any ideas, comments on implementation of such system in Haskell? What
existing Haskell GUI libraries can be used for a desktop GUI, etc.?



Well, it's basically just a proof-of-concept and not really usable for
real applications at the moment, but there is


   http://hackage.haskell.org/package/dingo-core-0.1.0
   http://hackage.haskell.org/package/dingo-widgets-0.1.0
   http://hackage.haskell.org/package/dingo-example-0.1.0

The basic client-server communication, server-side state handling, 
etc. is there, but it's missing a couple of things before it could be 
used for real apps: There's no real security, and there are *very* few 
widgets. The few widgets that exist at the moment are also probably 
lacking a few operations. On the plus side, it should be pretty easy
to create new widgets.


You can get a feel for how the thing looks from an application 
programmer's perspective by looking at the source for the example.




Re: [Haskell-cafe] How to make asynchronous I/O composable and safe?

2012-01-14 Thread Bardur Arantsson


On 01/14/2012 11:42 AM, Joey Adams wrote:

On Sat, Jan 14, 2012 at 1:29 AM, Bardur Arantsson <s...@scientician.net> wrote:

So, the API becomes something like:

   runSocketServer :: ((Chan a, Chan b) -> IO ()) -> ... -> IO ()

where the first parameter contains the client logic and A is the type of
the messages from the client and B is the type of the messages which are
sent back to the client.


Thanks, that's a good idea.  Even if I only plan to receive in one
thread, placing the messages in a Chan or TChan helps separate my
application thread from the complexities of connection management.



Unless TCP is an absolute requirement, something like 0MQ[1,2] may be 
worth investigating.


It handles all the nasty details and you get a simple message-based 
interface with lots of nice things like pub-sub, request-reply, etc. etc.


[1] http://hackage.haskell.org/package/zeromq-haskell-0.8.2
[2] http://www.zeromq.org/




Re: [Haskell-cafe] How to make asynchronous I/O composable and safe?

2012-01-13 Thread Bardur Arantsson


On 01/14/2012 06:24 AM, Joey Adams wrote:

I'm not happy with asynchronous I/O in Haskell.  It's hard to reason
about, and doesn't compose well.  At least in my code.


[--snip--]

Async I/O *is* tricky if you're expecting threads to do their own 
writes/reads directly to/from sockets. I find that using a 
message-passing approach for communication makes this much easier.


If you need multiple server threads to respond to the same client 
(socket) then the easiest approach might be to simply use a (Chan a) for 
output. Since you always put full messages to the Chan, exceptions cause 
no problems with respect to partial messages, etc.


You can also use a Chan for forwarding messages from the client socket 
to the appropriate server threads -- if you need several (or even all) 
threads to receive messages from the client you can use dupChan on the 
input-from-client channel you pass to the server threads.
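
A minimal sketch of the dupChan part, using only Control.Concurrent.Chan from base. The names `broadcastDemo`, `copyA` and `copyB` are made up for the example. Each duplicate sees every message written after it was created, so both readers below get the same message.

```haskell
import Control.Concurrent.Chan (dupChan, newChan, readChan, writeChan)

-- Two duplicates of one input channel: a message written to the
-- original after duplication is delivered to both copies.
broadcastDemo :: IO (String, String)
broadcastDemo = do
  input <- newChan
  copyA <- dupChan input
  copyB <- dupChan input
  writeChan input "ping"
  a <- readChan copyA
  b <- readChan copyB
  return (a, b)

main :: IO ()
main = broadcastDemo >>= print
```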


So, the API becomes something like:

   runSocketServer :: ((Chan a, Chan b) -> IO ()) -> ... -> IO ()

where the first parameter contains the client logic and A is the 
type of the messages from the client and B is the type of the messages 
which are sent back to the client.
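
Here is a rough, socket-free sketch of that shape. Everything named here (`withClientLogic`, `lengthLogic`, `demo`) is hypothetical and only stands in for the real thing: the "connection" is just a pair of channels, and the client logic never touches a socket, it only reads and writes whole messages.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Monad (forever, replicateM)

-- Hypothetical stand-in for runSocketServer: no real socket, the
-- connection is just a pair of channels handed to the client logic.
withClientLogic :: ((Chan a, Chan b) -> IO ()) -> IO (Chan a, Chan b)
withClientLogic logic = do
  fromClient <- newChan
  toClient   <- newChan
  _ <- forkIO (logic (fromClient, toClient))
  return (fromClient, toClient)

-- Client logic: answer every complete message with its length.
lengthLogic :: (Chan String, Chan Int) -> IO ()
lengthLogic (fromClient, toClient) = forever $ do
  msg <- readChan fromClient
  writeChan toClient (length msg)

-- Play the role of the socket side: push two messages in, read two
-- replies back out.
demo :: IO [Int]
demo = do
  (fromClient, toClient) <- withClientLogic lengthLogic
  mapM_ (writeChan fromClient) ["hello", "haskell-cafe"]
  replicateM 2 (readChan toClient)

main :: IO ()
main = demo >>= print
```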


Hope this helps,




Re: [Haskell-cafe] Solved but strange error in type inference

2012-01-03 Thread Bardur Arantsson


2012/1/3 Yucheng Zhang <yczhan...@gmail.com>



(Hopefully being a little more explicit about this can help you 
understand where things are going wrong.)


[--snip--]


legSome :: LegGram nt t s -> nt -> Either String ([t], s)


The 'nt' you see above


legSome (LegGram g) ntV =
  case Main.lookup ntV g of
    Nothing -> Left "No word accepted!"
    Just l -> let sg = legSome (LegGram (Main.delete ntV g))
                  subsome :: [RRule nt t s] -> Either String ([t], s)


... isn't the same as the 'nt' you see in this line, so it constrains 
'subsome' to a different type than the one you intended -- and indeed 
one which can't be unified with the inferred type. (Unless you use 
ScopedTypeVariables.)
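
A small, unrelated example of the same effect, in case it helps (the names `pairUp` and `attach` are made up for the illustration). With the extension on and an explicit `forall`, the `a` in the local signature is the outer `a`; drop either the pragma or the `forall` and the local `a` becomes a fresh variable, so the definition no longer type-checks.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

-- The explicit forall brings 'a' into scope for the whole definition,
-- including signatures in the where clause.
pairUp :: forall a. a -> [a] -> [(a, a)]
pairUp x ys = map attach ys
  where
    attach :: a -> (a, a)   -- this 'a' is the outer 'a'
    attach y = (x, y)

main :: IO ()
main = print (pairUp 'x' "ab")
```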


As Brent suggested, you should probably pull the subsome function out 
into a top-level function in any case.


Cheers,



[Haskell-cafe] Is the haddock generator on Hackage down?

2012-01-03 Thread Bardur Arantsson


Hi all,

No Haddock documentation seems to have been generated on Hackage in the 
past few days. See e.g.


   http://hackage.haskell.org/package/copilot-2.0.3
or http://hackage.haskell.org/package/derive-2.5.5

Does anyone know if the cron job (or whatever) isn't running for some 
reason?


Cheers,



Re: [Haskell-cafe] On the purity of Haskell

2011-12-30 Thread Bardur Arantsson


On 12/29/2011 07:07 PM, Steve Horne wrote:

On 29/12/2011 10:05, Jerzy Karczmarczuk wrote:

Sorry, a long and pseudo-philosophical treatise. Trash it before reading.

Heinrich Apfelmus:

You could say that side effects have been moved from functions to
some other type (namely IO) in Haskell.

I have no reason to be categorical, but I believe that calling the
interaction of a Haskell programme with the World - a side effect is
sinful, and it is a source of semantical trouble.

People do it, SPJ (cited by S. Horne) did it as well, and this is too
bad.
People, when you eat a sandwich: are you doing side effects?? If you
break a tooth on it, this IS a side effect, but neither the eating nor
digesting it, seems to be one.


By definition, an intentional effect is a side-effect. To me, it's by
deceptive redefinition - and a lot of arguments rely on mixing
definitions - but nonetheless the jargon meaning is correct within
programming and has been for decades. It's not going to go away.



This doesn't sound right to me. To me, a side effect is something
which happens as an (intended or unintended) consequence of something
else. An effect which you want to happen (e.g. by calling a procedure,
or letting the GHC runtime interpret an IO Int) is just an effect.




Re: [Haskell-cafe] On the purity of Haskell

2011-12-30 Thread Bardur Arantsson


On 12/29/2011 11:06 PM, Steve Horne wrote:

On 29/12/2011 21:01, Chris Smith wrote:

On Thu, 2011-12-29 at 18:07 +, Steve Horne wrote:

By definition, an intentional effect is a side-effect. To me, it's by
deceptive redefinition - and a lot of arguments rely on mixing
definitions - but nonetheless the jargon meaning is correct within
programming and has been for decades. It's not going to go away.

Basically, the jargon definition was coined by one of the pioneers of
function programming - he recognised a problem and needed a simple way
to describe it, but in some ways the choice of word is unfortunate.

I don't believe this is true. Side effect refers to having a FUNCTION
-- that is, a map from input values to output values -- such that when
it is evaluated there is some effect in addition to computing the
resulting value from that map. The phrase side effect refers to a
very specific confusion: namely, conflating the performing of effects
with computing the values of functions.

Yes - again, by definition that is true. But that definition is not the
everyday definition of side-effect.


Repeating and explaining one definition doesn't make the other go away.



That's what you seem to be doing a lot in this thread. It's very hard to 
glean what *exactly* you're trying to argue since you seem to be all 
over the place.


(I hope this isn't taken as an insult, it certainly isn't meant as one.)

Maybe a summary of your argument + counter-arguments (as you understand 
them) on a wiki would be helpful? Mail threads with 40+ posts aren't 
really useful for hashing out this kind of thing.




1. To say that the C printf function has the side-effect of printing to
the screen - that's true.


No, it has the effect of printing to the screen. When you call 
printf() you *intend* for it to print something.



2. To say that the C printf function has no side-effects because it
works correctly - the only effects are intentional - that's also true.



I realize this is nitpicking, but all of its effects may not be 
intentional. For example, given certain terminal settings it may also 
flush the buffer if you have a newline in the argument string. That's a 
side effect (may be desirable/undesirable).


[--snip--]

Using similar mixed definitions to conclude that every C program is full
of bugs (basically equating intentional effects with side-effects, then
equating side-effects with unintentional bugs) is a fairly common thing
in my experience, but it's a logical fallacy. If you aren't aware of the
two definitions of side-effect, it's hard to get deal with that.

Some people don't want anyone to figure out the fallacy - they like
having this convenient way to attack C, irrespective of whether it's
valid or not. Rare I think - mostly it's more confusion and memetics.
But still, I'm convinced there's some sophistry in this. And I'm not the
only person to think so, and to have reacted against that in the past.

Extra sad - you don't need that fallacy to attack C. It's redundant. C
is quite happy to demonstrate its many failings.


That's the flimsiest straw man I've ever seen.



Re: [Haskell-cafe] On the purity of Haskell /Random generators

2011-12-30 Thread Bardur Arantsson


On 12/29/2011 09:39 PM, Jerzy Karczmarczuk wrote:


Truly random numbers are very rarely used, forget about them.


Well, obviously, but why should we forget about them? The usual
approach(*) is to gather entropy from a truly(**) random source
and use that to seed (and perhaps periodically re-seed) a PRNG.

(*) At least as far as I understand it.
(**) At least one believed to be truly random.

My point was simply to make clear the distinction between RNG vs. PRNG.


Standard r. generators (pseudo-random) in Haskell are monadic, because
the relevant algorithms are stateful.
Congruential, Fibonacci, Mersenne Twister, whatever, is a function, more
or less:
(newValue,newSeed) = rgen seed

The monadic approach serves mainly to hide the seed.
Some people prefer to use random streams, no monads, so the question of
Steve Horne is not universal.


Random streams are not referentially transparent, though, AFAICT...?

Either way this thread has gone on long enough, let's not prolong it 
needlessly with this side discussion.




Re: [Haskell-cafe] (...) Random generators

2011-12-30 Thread Bardur Arantsson


On 12/30/2011 04:38 PM, Jerzy Karczmarczuk wrote:
 Bardur Arantsson:
 Random streams are not referentially transparent, though, AFAICT...?

 Either way this thread has gone on long enough, let's not prolong it
 needlessly with this side discussion.

 Sure.
 But the discussion on randomness is /per se/ interesting, especially in
 a functional setting.

 Anyway, nobody can convince Steve Horne. Perhaps as an unintentional
 side-effect...

 But random streams, or rather pseudo-random streams (infinite lazy
 lists, as the example I gave, the `iterate` of `next`) are as
 referentially transparent as any Haskell data. Really.


Of course -- if you just have a starting seed and the rest of the 
sequence is known from there. I was thinking of e.g. those periodic 
re-initialization ways of doing RNG.
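
For the fixed-seed case, a minimal sketch of such a stream. The generator is a plain linear congruential step (multiplier and increment as given in Numerical Recipes, modulus 2^32 via Word32 wrap-around), picked only for illustration; `step` and `stream` are names invented here, and this is not what any particular library does.

```haskell
import Data.Word (Word32)

-- One step of a linear congruential generator; Word32 arithmetic
-- wraps around, which gives the modulus 2^32 for free.
step :: Word32 -> Word32
step s = 1664525 * s + 1013904223

-- The whole pseudo-random sequence as a lazy infinite list, fully
-- determined by the seed (the seed itself is dropped from the output).
stream :: Word32 -> [Word32]
stream = drop 1 . iterate step

main :: IO ()
main = do
  let xs = take 3 (stream 42)
      ys = take 3 (stream 42)
  print (xs == ys)   -- same seed, same sequence
```

A generator that periodically re-seeds from an outside entropy source has no such pure description, since the sequence then depends on more than the starting seed.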


 I *NEVER* used
 true random numbers, even to initialize a generator, since in the
 simulation business it is essential that you can repeat the sequence on
 some other platform, with some other parameters, etc.


I've heard this a lot from physicists -- of course if you run a 
simulation reproducibility can be extremely important (e.g. for 
double-checking computations across different machines). However, if 
you're doing crypto it may not be so desirable :).


Anyway, I'm out of this thread too :).

Cheers,



Re: [Haskell-cafe] On the purity of Haskell

2011-12-30 Thread Bardur Arantsson


On 12/30/2011 10:10 PM, Steve Horne wrote:

On 30/12/2011 10:47, Bardur Arantsson wrote:

On 12/29/2011 11:06 PM, Steve Horne wrote:

Calling it a straw man won't convince anyone who has the scars from
being attacked by those straw men.

I've been in those arguments, being told that C has side-effects
therefore all C programs are full of bugs, whereas Haskell can't have
similar bugs because it doesn't have side-effects.

[--snip--]

Please stop or quote someone.



I'm really not interested in whose-side-are-you-on arguments. Trying to
keep the two definitions separate is relevant, and that was my
motivation for saying this - it's a fact that if you mix your
definitions up enough you can prove anything.



Yes, and if you throw up enough verbiage or move goalposts enough you 
(impersonal) can tire anyone. That doesn't prove anything.



I like C++. I recognise the flaws in C++, as every everyday-user of the
language must. Pretending they don't exist doesn't solve the issues -
it's for OTT advocates, not developers. I don't insist that every
virtuous-sounding term must apply to C++. I don't pretend every C++
advocate is an angel.


I dislike C++. There's one reason for that: Undefined behavior. 
Haskell still has some of that, but as long as you steer clear of 
unsafePerformIO, you're mostly good.



I like Haskell. I can't claim to be an everyday user, but I'm learning
more and using it more all the time. I'm still uncertain whether some
flaws I see are real - some that I used to see weren't - but I'll
address that over time by thinking and debating. I won't pretend every
Haskell advocate is an angel.


I really don't care if you like or dislike Haskell, nor does anyone else 
AFAICT. Thinking is good. Debating is also fine as long as you're 
prepared to listen to what people are saying.


[--snip--]

Conal Elliott was right -- at least about the debate part :)

That really *is* my last post on this thread.



Re: [Haskell-cafe] On the purity of Haskell

2011-12-29 Thread Bardur Arantsson


On 12/29/2011 08:47 PM, Steve Horne wrote:

On 29/12/2011 19:21, Heinrich Apfelmus wrote:



BTW - why use an IO action for random number generation? There's a
perfectly good pure generator. It's probably handy to treat it
monadically to sequence the generator state/seed/whatever but random
number generation can be completely pure.


*Pseudo* random number generation can of course be pure (though 
threading the state would be tedious and error-prone). If you want truly 
random numbers you cannot avoid IO (the monad).
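
A sketch of what threading the state by hand looks like. The generator `rgen` here is a hypothetical toy (a linear congruential step returning the high bits as the value), assumed only for this example. The error-prone part is plain to see: reusing `s1` where `s2` was meant still compiles and quietly repeats a number.

```haskell
import Data.Word (Word32)

-- Hypothetical toy generator: returns a value together with the new
-- seed. Not taken from any library.
rgen :: Word32 -> (Word32, Word32)
rgen seed = (s' `div` 65536, s')
  where
    s' = 1664525 * seed + 1013904223

-- Threading the seed by hand: every call must be given the seed
-- returned by the previous call.
threeDraws :: Word32 -> ([Word32], Word32)
threeDraws s0 =
  let (a, s1) = rgen s0
      (b, s2) = rgen s1
      (c, s3) = rgen s2
  in ([a, b, c], s3)

main :: IO ()
main = print (threeDraws 7 == threeDraws 7)   -- pure: same seed, same result
```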




Re: [Haskell-cafe] If you'd design a Haskell-like language, what would you do different?

2011-12-22 Thread Bardur Arantsson

Alexander Solla wrote:

 I happen to only write Haskell programs that terminate.  It is not that
 hard.  We must merely restrict ourselves to the total fragment of the
 language, and there are straight-forward methods to do so.

Do (web/XML-RPC/whatever) server type programs terminate?





Re: [Haskell-cafe] What happens if you get hit by a bus?

2011-12-16 Thread Bardur Arantsson

Michael Litchard wrote:

[--snip--]

If getting hit by a bus is a significant factor in the overall outcome of 
the project then I think those are pretty good odds, aren't they?

(I do realize that traffic accidents are a lot more frequent than we like to 
think, but still...)

-- 
Bárður Árantsson




Re: [Haskell-cafe] What happens if you get hit by a bus?

2011-12-16 Thread Bardur Arantsson

Andrew Coppin wrote:

 On 16/12/2011 07:05 PM, Bardur Arantsson wrote:
 Michael Litchard wrote:

 [--snip--]

 If getting hit by a bus is a significant factor in the overall outcome of
 the project then I think those are pretty good odds, aren't they?

 (I do realize that traffic accidents are a lot more frequent than we like
 to think, but still...)
 
 The /actual/ probability of being hit by a bus is irrelevant. The only
 thing of consequence is the /perceived/ probability. This latter
 quantity is not related to the former in any meaningful way. In fact,
 due to an effect known as availability bias, the probability of any
 potential threat varies depending on how long you spend thinking
 about it.

[snip blah blah blah]

- Not to be rude, but... (*)

That was the point of my post.

If you're actually confronted with this perception that traffic accidents 
are relevant to project success, you're already in deep manure because 
there's so much more than code in a project. That's what you need to 
explain.

Code is the means of getting us to an end. It seems these people are 
worrying about the means when the big problem is actually conveying the
ends.

(Again, just my take on the situation, I'm not claiming canonicity or 
anything.)

-- 
Bárður Árantsson

(*) I realize that this is rude. I can only apologize.



Re: [Haskell-cafe] ANNOUNCE: hxournal-0.5.0.0 - A pen notetaking program written in haskell

2011-12-12 Thread Bardur Arantsson


On 12/13/2011 02:43 AM, Brandon Allbery wrote:

On Mon, Dec 12, 2011 at 19:22, Ian-Woo Kim <ianwoo...@gmail.com> wrote:


A workaround is to make a symbolic link to libstdc++.so.6 with the
name of libstdc++.so in /usr/lib or /usr/local/lib or other dynamic
library path like the following.

ln -s /usr/lib/libstdc++.so.6 /usr/lib/libstdc++.so



AFAICT, this is incorrect and should be something like

   ln -s /usr/lib/x86_64-linux-gnu/libstdc++.so.6 /usr/lib/libstdc++.so

(where x86_64-linux-gnu depends on your platform).

Normally this isn't a problem since the above-mentioned directory is in
ld.so.conf, but that (apparently) isn't handled correctly by GHC.


A less permanent workaround is to just add

   /usr/lib/x86_64-linux-gnu

to your LD_LIBRARY_PATH in your environment before running anything GHC 
related.



This is an indication that you have not installed your distribution's -dev
package for the library in question.  You should do so instead of making
the symlink manually.



Many distros have started to *not* install a /usr/lib/libstdc++.so 
symlink (nor even any /usr/lib/libstdc++*.so files at all) in /usr/lib,
preferring to use the above-mentioned directory instead and listing it 
in /etc/ld.so.conf.


It has something to do with getting saner multilib (and multiarch?) support.



Re: [Haskell-cafe] A small Darcs anomoly

2011-04-28 Thread Bardur Arantsson


On 04/28/2011 12:19 AM, Ganesh Sittampalam wrote:

On 26/04/2011 12:17, Malcolm Wallace wrote:


On 25 Apr 2011, at 11:13, Andrew Coppin wrote:


On 24/04/2011 06:33 PM, Jason Dagit wrote:


This is because of a deliberate choice that was made by David Roundy.
In darcs, you never have multiple branches within a single darcs
repository directory tree.


Yes, this seems clear. I'm just wondering whether or not it's the best design 
choice.


It seems to me to be a considerable insight.  Branches and repositories are the 
same thing.  There is no need for two separate concepts.  The main reason other 
VCSes have two concepts is because one of them is often more efficiently 
implemented (internally) than the other.  But that's silly - how much better to 
abstract over the mental clutter, and let the implementation decide how its 
internals look!

So in darcs, two repositories on the same machine share the same files (like a 
branch), but if they are on different machines, they have separate copies of 
the files.  The difference is a detail that you really don't need to know or 
care about.


It does mean that you duplicate information. You have [nearly] the same set of 
patches stored twice,


No, if on the same machine, the patches only appear once, it is just the index 
that duplicates some information (I think).  In fact just as if it were a 
branch in another VCS.


Unfortunately, I don't think this is quite true, because being able to
switch between multiple branches in the same working directory means you
can reuse build products when switching branches. Depending on how
radical the branch shift is, this can be a substantial win, and it's the
main reason that darcs might in future implement in-repo branching of
some form.



There's also the fact that using in-repo branches means that all the 
tooling doesn't have to rely on any (fs-specific) conventions for 
finding branches.


As someone who has admin'd a reasonably large Bazaar setup (where branch 
== directory similarly to Darcs) I can honestly say that this would be a 
HUGE boon.


Cheers,



[Haskell-cafe] Re: Monads and Functions sequence and sequence_

2010-10-29 Thread Bardur Arantsson


On 2010-10-30 07:07, Mark Spezzano wrote:

Hi,

Can somebody please explain exactly how the monad functions sequence and 
sequence_ are meant to work?

I have almost every Haskell textbook, but there's surprisingly little 
information in them about the two functions.

 From what I can gather, sequence and sequence_ behave differently 
depending on the types of the Monads that they are processing. Is this correct? Some concrete 
examples would be really helpful.



sequence [m1,m2,m3,m4,...] = do
  x1 <- m1
  x2 <- m2
  x3 <- m3
  x4 <- m4
  ...
  return [x1,x2,x3,x4,...]

sequence_ [m1,m2,m3,m4,...] = do
  m1
  m2
  m3
  m4
  ...
  return ()
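
To make the definitions above concrete, here is a small sketch of the same `sequence` running in three different monads. The top-level names (`maybeAll`, `maybeSome`, `allCombos`, `demoIO`, `demoIO_`) are made up for this illustration; only `sequence` and `sequence_` come from the Prelude.

```haskell
-- Illustrative sketch: 'sequence' and 'sequence_' in different monads.
-- All top-level names here are invented for the example.

-- Maybe: the result is Just a list only if every element is Just.
maybeAll :: Maybe [Int]
maybeAll = sequence [Just 1, Just 2, Just 3]      -- Just [1,2,3]

maybeSome :: Maybe [Int]
maybeSome = sequence [Just 1, Nothing, Just 3]    -- Nothing

-- List: every way of picking one element from each inner list.
allCombos :: [[Int]]
allCombos = sequence [[1, 2], [10, 20]]           -- [[1,10],[1,20],[2,10],[2,20]]

-- IO: run the actions left to right and collect their results.
demoIO :: IO [Int]
demoIO = sequence [return 1, putStrLn "side effect" >> return 2]

-- sequence_ runs the same kind of actions but discards the results.
demoIO_ :: IO ()
demoIO_ = sequence_ [putStrLn "first", putStrLn "second"]
```

So the functions themselves are the same everywhere; what differs is what "do this, then that" means in each monad.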

Cheers,


[Haskell-cafe] Re: Sockets get bound to wrong port on Windows

2010-06-02 Thread Bardur Arantsson


On 2010-06-03 05:10, Matthias Reisner wrote:

Hi,

there's something wrong with port numbers in the Network.Socket module
of package network. Printing values gives:

*Main> PortNum
47138
*Main> PortNum 47138




Try

   (fromIntegral ) :: PortNumber

(Yes, it's weird.)
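
A likely explanation, offered here as an assumption rather than something stated in the thread: in the version of the network package discussed, the `PortNum` constructor wraps the raw 16-bit value in network (big-endian) byte order, while `fromIntegral` performs the host-to-network conversion. On a little-endian machine that makes the two bytes appear swapped. The sketch below only demonstrates the byte swap with `byteSwap16` from base; it does not use the network package, and `swapPortBytes` is a name invented for the example.

```haskell
import Data.Word (Word16, byteSwap16)

-- | Swap the two bytes of a 16-bit value. This is what a little-endian
-- host observes when a value stored in network byte order is read back
-- without conversion (and vice versa).
swapPortBytes :: Word16 -> Word16
swapPortBytes = byteSwap16

-- | The value printed in the session above, with its bytes swapped:
-- 0xB822 becomes 0x22B8.
swapped :: Word16
swapped = swapPortBytes 47138
```

Swapping twice gives back the original value, which is why the effect looks symmetric in GHCi.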

Cheers,

Bardur


[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-03-25 Thread Bardur Arantsson


On 2010-02-24 20:50, Brandon S. Allbery KF8NH wrote:

tcpdump 'host ps3 and tcp[tcpflags] & 0x27 != 0'
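
For readers less familiar with tcpdump filters: assuming the mask is applied with a bitwise AND, 0x27 selects packets that have any of FIN, SYN, RST or URG set. The sketch below decodes such a mask using the standard TCP flag bit values; `tcpFlags` and `flagsIn` are names invented for this example.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word8)

-- Standard TCP header flag bits (low six bits of the flags byte).
tcpFlags :: [(Word8, String)]
tcpFlags =
  [ (0x01, "FIN"), (0x02, "SYN"), (0x04, "RST")
  , (0x08, "PSH"), (0x10, "ACK"), (0x20, "URG") ]

-- | Names of the flags selected by a mask, lowest bit first.
flagsIn :: Word8 -> [String]
flagsIn mask = [name | (b, name) <- tcpFlags, mask .&. b /= 0]
```

In GHCi, `flagsIn 0x27` evaluates to `["FIN","SYN","RST","URG"]`, i.e. the filter skips ordinary data and pure ACK packets.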


(Indulging in some serious thread necromancy here, but...)

Alright, I've _finally_ got round to doing a dump with leaking file 
descriptors (or threads as the case may be).


The relevant bits of the lsof output for the leaked file descriptors look 
like this:


hums   2084 bardur   36u  sock    0,6       0t0 45022400 can't identify protocol
hums   2084 bardur   37r   REG   0,23 733054976   267762 THE_MOVIE.avi


So pairs of FDs are being held up by sendfile when the PS3 disconnects. 
The number of such pairs in this test run was 16.


I've attached the gzipped output from tcpdump.

The only striking thing I can see about the dump is that there are 22 
(conspicuously close to 16) sequences like:


19:45:30.135291 IP 192.168.0.115.64931 > gwendolyn.9000: Flags [R], seq 2112225068, win 0, length 0
19:45:30.135295 IP 192.168.0.115.64931 > gwendolyn.9000: Flags [R], seq 2112225068, win 0, length 0
19:45:30.135299 IP 192.168.0.115.64931 > gwendolyn.9000: Flags [R], seq 2112225068, win 0, length 0
19:45:30.135302 IP 192.168.0.115.64931 > gwendolyn.9000: Flags [R], seq 2112225068, win 0, length 0


My tcpdump-fu is rather limited, so I'm not really sure what this 
actually means... any input much appreciated.


Cheers,


dump_with_leaking_fds.log.gz
Description: application/gzip

[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-24 Thread Bardur Arantsson


On 2010-02-24 05:10, Brandon S. Allbery KF8NH wrote:

On Feb 21, 2010, at 20:17 , Jeremy Shaw wrote:

The PS3 does do something though. If we were doing a write *and* read
select on the socket, the read select would wakeup. So, it is trying
to notify us that something has happened, but we are not seeing it
because we are only looking at the write select().


Earlier the OP claimed this would happen within a few minutes if he
seeked in a movie. If it's that reproducible, it should be easy to
capture a tcpdump and attach it to an email (or pastebin it), allowing
us to determine what really happens.


It's a huge amount of data since it's streaming ~900Kb/s (or 
thereabouts). I don't think it's really practical to look through all 
that to try to figure out exactly when the problem occurs.


Anyone know of any programs which can highlight 'anomalous' tcp traffic 
in tcpdumps?


Still, I'd be happy to try a capture and upload it somewhere if anyone 
cares to have a look at it. It'll have to wait for the weekend, though.


Cheers,


[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-21 Thread Bardur Arantsson


Jeremy Shaw wrote:

Hello,

I think to make progress on this bug we really need a failing test case that
other people can reproduce.

I have hacked up small server that should reproduce the error (using fdWrite
instead of sendfile). And a small C client which is intended to reproduce
the error -- but doesn't.

I have attached both.

The server tries to write a whole lot of 'a' characters to the client. The
client does not consume any of them. This causes the server to block on the
threadWaitWrite.

No matter how I kill the client, threadWaitWrite always wakes up.


Are you running the client and server on different physical machines? If 
so, have you tried simply yanking the connection?


Your client isn't dropping the connection hard -- if you kill the client 
(even with a -9) your OS cleans up any open sockets it has. On 
well-behaved OS'es that cleanup usually involves properly shutting down 
the connection somehow. Different OS'es have different ideas about what 
constitutes properly shutting down the connection -- some simply don't.


My hypothesis is that the PS3 doesn't properly shut down the connection, 
but simply sends a RST (or maybe a FIN) and drops any further packets. 
I'll do a Wireshark dump after posting this to see if I can see what 
it's doing at the TCP level -- I'm not optimistic about seeing the exact 
moment when the leak occurs, but maybe the general pattern can yield 
some useful ideas.


I have no idea how to test this without using an actual PS3.

So, we need to figure out exactly what the PS3 is doing differently that causes
threadWaitWrite to not wakeup..


Does it matter? I can reproduce this reliably within a few minutes of 
testing.


Note that this doesn't happen *every* time the PS3 disconnects and 
reconnects, it just happens some of the time. It's enough to eat up 
MAX_FDs file descriptors in a few hours of playing media normally. If I 
do a lot of seeking (forces a disconnect+reconnect) through the movie, 
at least one file descriptor usually leaks within a few minutes.



If we don't know why it is failing, then I
don't think we can properly fix it.


I'm more pragmatic: If, after applying a fix, I cannot reproduce this 
problem within a few hours (or so) of running my media server, I'd say 
it's fixed. As long as the modifications to the sendfile library don't 
change its behavior in other ways, I don't see the problem.


P.S. Does anyone else out there have a PS3 to test with?


[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-21 Thread Bardur Arantsson


Taru Karttunen wrote:

Excerpts from Bardur Arantsson's message of Wed Feb 17 21:27:07 +0200 2010:
For sendfile, a timeout of 1 second would probably be fine. The *ONLY* 
purpose of threadWaitWrite in the sendfile code is to avoid busy-waiting 
on EAGAIN from the native sendfile.


Of course this will kill connections for all clients that may have a
two second network hiccup.



I'm not talking about killing the connection. I'm talking about retrying 
sendfile() if threadWaitWrite has been waiting for more than 1 second.


If the connection *has already been closed* (as detected by the OS), 
then sendfile() will fail with EBADF, and we're good.


If the connection *hasn't been closed*, there are two cases:

  a) Sendfile can send more data, in which case it does and we go back 
to sleep on a threadWaitWrite.
  b) Sendfile cannot send more data... in which case the sendfile 
library gets an EAGAIN and goes back to sleep on a threadWaitWrite.
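
The retry scheme described above can be sketched roughly as follows. This is not the sendfile library's code: `Attempt`, `sendWithRetry`, `scripted` and `demo` are hypothetical names, and the native call is abstracted as an `IO Attempt` action so the loop can be tried without a socket. In real use the wait action would be something like `threadWaitWrite fd`, and a dead connection would be expected to surface as an exception thrown by the attempt.

```haskell
import Control.Concurrent (threadDelay)
import Data.IORef (atomicModifyIORef', newIORef)
import System.Timeout (timeout)

-- | Outcome of one hypothetical non-blocking send attempt.
data Attempt
  = WouldBlock   -- the native call reported EAGAIN
  | Sent Int     -- this many bytes were written

-- | Send the given number of bytes. When an attempt would block, wait for
-- writability for at most 'maxWait' microseconds, then try again whether
-- or not the wait completed.
sendWithRetry
  :: Int          -- maximum wait in microseconds (1000000 = 1 second)
  -> IO ()        -- wait for writability, e.g. threadWaitWrite fd
  -> IO Attempt   -- one send attempt (hypothetical wrapper)
  -> Int          -- bytes still to send
  -> IO ()
sendWithRetry maxWait waitWritable attempt = go
  where
    go remaining
      | remaining <= 0 = return ()
      | otherwise = do
          r <- attempt
          case r of
            Sent n     -> go (remaining - n)
            WouldBlock -> do
              _ <- timeout maxWait waitWritable  -- Nothing means timed out
              go remaining

-- | Stand-in for a real socket: replays a fixed script of outcomes.
scripted :: [Attempt] -> IO (IO Attempt)
scripted script = do
  ref <- newIORef script
  return $ atomicModifyIORef' ref $ \s -> case s of
    []         -> ([], error "script exhausted")
    (a : rest) -> (rest, a)

-- | The wait never completes here, simulating a peer that went silent;
-- the loop still makes progress because of the timeout.
demo :: IO ()
demo = do
  attempt <- scripted [Sent 4, WouldBlock, Sent 6]
  sendWithRetry 1000 (threadDelay 60000000) attempt 10
  putStrLn "all bytes sent"
```

The point of the design is that the timeout never terminates the transfer by itself; it only bounds how long a thread can sit in the wait before the send is attempted again.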


I don't see how that would lead to anything like what you describe.

How so? As a user I expect sendfile to work and not semi-randomly block 
threads indefinitely.


If you want sending something to terminate you will add a timeout to
it. A nasty client may e.g. take one byte each minute and sending your
file may take a few years.



This can always happen. The timeout here is at the application level 
(e.g. has this connection been open too long) and sendfile shouldn't 
concern itself with such things.


The point with my proposed fix is that sendfile will be reacting to the 
OS's detection of a dropped connection ASAP (plus 1 second) rather than 
just hanging.


Cheers,


[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-17 Thread Bardur Arantsson


Jeremy Shaw wrote:

On Wed, Feb 17, 2010 at 2:36 AM, Taru Karttunen tar...@taruti.net wrote:


So for sendfile, instead of threadWaitWrite we could do:

 r <- timeout (60 * 10^6) threadWaitWrite
 case r of
   Nothing -> ... -- timed out
   (Just ()) -> ... -- keep going



For sendfile, a timeout of 1 second would probably be fine. The *ONLY* 
purpose of threadWaitWrite in the sendfile code is to avoid busy-waiting 
on EAGAIN from the native sendfile.


What would work is, instead of using threadWaitRead (as in the code you 
supplied) to simply have a 1 second timeout which causes the loop to 
call the native sendfile again. Native sendfile *will* fail with an 
error code if the socket has been disconnected.


With that in place dead threads waiting on threadWaitWrite will only 
linger at most 1 second before discovering the disconnect.


Not ideal, but a lot better than the current situation.


Does that sound like the right fix to you?


[--snip--]


(Obviously, if people are using sendfile with something other than happstack,
it does not help them, but it sounds like trying to fix things in
sendfile is misguided anyway.)




How so? As a user I expect sendfile to work and not semi-randomly block 
threads indefinitely.


Cheers,


[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-16 Thread Bardur Arantsson


Jeremy Shaw wrote:

On Sun, Feb 14, 2010 at 2:04 PM, Bardur Arantsson s...@scientician.net wrote:



Not sure what the best solution for this would be, API-wise... Maybe
actually have sendfile read the data and supply it to a user-defined
function which could react to the data in some way? (Could supply two
standard functions: disconnect immediately and accumulate all received
data into a bytestring.)



I think this goes beyond just a sendfile issue -- anyone trying to write
non-blocking network code should run into this issue, right ? For now, maybe
we should patch sendfile with what we have. But I think we really need to
summarize our findings, see if we can generate a test case, and then see
what Simon Marlow and company have to say...


As far as I can tell, all nonblocking networking code is vulnerable to 
this issue (unless it actually does use threadWaitRead, obviously :)).


In particular, I would imagine most of the Haskell HTTP servers are 
vulnerable to this since they do use the same pattern of:


  1) read all the input from the client connection,
  2) send all the output to the client connection

where there is no reading from the socket in step 2.

I'm just not sure whether the GHC built-in I/O code *somehow*
avoids this problem. I think my tests indicate that it does, so it would 
seem that it's only when you go C that you need to worry.


Re: a test case, you'll probably need to run the test case code on a 
client whose OS allows (from userspace) the sudden dropping of 
connections without sending a proper connection shutdown sequence. I'm 
not sure what that OS would be.


Cheers,


[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-16 Thread Bardur Arantsson


Bardur Arantsson wrote:

Jeremy Shaw wrote:

[--snip--]
Re: a test case, you'll probably need to run the test case code on a 
client whose OS allows (from userspace) the sudden dropping of 
connections without sending a proper connection shutdown sequence. I'm 
not sure what that OS would be.


Actually, scratch that. Maybe it's just a question of having a high enough 
connection rate to hit the case where threadWaitWrite hangs. Although 
I did try a few times using wget, I didn't really try hammering the 
server properly. It probably needs the right timing to trigger the 
problem (i.e. the disconnect needs to happen exactly when sendfile is 
done with its block and we're going around to threadWaitWrite again.)


I'll see if I get the time to try a test client which can really hammer my 
server -- that ought to be able to trigger the problem. If that works, 
I'll try to produce a minimal server program which still exhibits the issue.


Cheers,


[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-16 Thread Bardur Arantsson


Taru Karttunen wrote:

Excerpts from Bardur Arantsson's message of Tue Feb 16 22:57:23 +0200 2010:
As far as I can tell, all nonblocking networking code is vulnerable to 
this issue (unless it actually does use threadWaitRead, obviously :)).


There are a few easy fixes:

1) socket timeouts with Network.Socket.setSocketOption


The whole point of this thread is that this isn't sufficient.


2) just make your server code have timeouts in Haskell

This cannot be fixed in the sendfile library, it is a 
feature of TCP that connections may linger for a long
time unless explicit timeouts are used.


The problem is that the sendfile library *doesn't* wake
up when the connection is terminated (because of threadWaitWrite)
-- it doesn't matter what the timeout is.

Client code of the sendfile library shouldn't have to try
to work around this -- it's absurd to expect it to.

Please read the entire thread.

Cheers,


[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-14 Thread Bardur Arantsson


Jeremy Shaw wrote:


import Control.Concurrent
import Control.Concurrent.MVar
import System.Posix.Types

data RW = Read | Write

threadWaitReadWrite :: Fd -> IO RW
threadWaitReadWrite fd =
  do m <- newEmptyMVar
     rid <- forkIO $ threadWaitRead fd  >> putMVar m Read
     wid <- forkIO $ threadWaitWrite fd >> putMVar m Write
     r <- takeMVar m
     killThread rid
     killThread wid
     return r


[--snip--]

I've tested this extensively during this weekend and not a single 
leaked FD so far.


I think we can safely say that polling an FD for read readiness is 
sufficient to properly detect a disconnected client regardless of 
why/how the client disconnected.


The only issue I can see with just dropping the above code directly into 
the sendfile library is that it may lead to busy-waiting on EAGAIN *if* 
the client is actually trying to send data to the server while it's 
receiving the file via sendfile(). If the client sends even a single 
byte and the server isn't reading it from the socket, then 
threadWaitRead will keep returning immediately since it's 
level-triggered rather than edge triggered.


In the worst case this could be exploited by evil clients as a trivial 
way to DoS a server -- simply send data while the server is sending you 
a file. Bam, instant 100% CPU utilization on the server.


Not sure what the best solution for this would be, API-wise... Maybe 
actually have sendfile read the data and supply it to a user-defined 
function which could react to the data in some way? (Could supply two 
standard functions: disconnect immediately and accumulate all 
received data into a bytestring.)
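
Purely as a sketch of what such an API might look like (none of this exists in the sendfile library; `Verdict`, `InputHandler`, `disconnectImmediately`, `accumulateInto`, `collected` and `demo` are all hypothetical names), the two standard handlers could be expressed along these lines, using strict ByteStrings from the bytestring package:

```haskell
import qualified Data.ByteString as B
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- | What the (hypothetical) send loop should do after the handler ran.
data Verdict = KeepSending | Disconnect
  deriving (Eq, Show)

-- | Called with whatever bytes the client sent while a file was going out.
type InputHandler = B.ByteString -> IO Verdict

-- | Standard handler 1: any unexpected input ends the transfer.
disconnectImmediately :: InputHandler
disconnectImmediately _ = return Disconnect

-- | Standard handler 2: collect everything received and keep sending.
accumulateInto :: IORef [B.ByteString] -> InputHandler
accumulateInto ref chunk = do
  modifyIORef' ref (chunk :)   -- chunks are stored newest first
  return KeepSending

-- | Read back what 'accumulateInto' collected, in arrival order.
collected :: IORef [B.ByteString] -> IO B.ByteString
collected ref = B.concat . reverse <$> readIORef ref

-- | Feed two chunks to the accumulating handler and return the result.
demo :: IO B.ByteString
demo = do
  ref <- newIORef []
  _ <- accumulateInto ref (B.pack [1, 2])
  _ <- accumulateInto ref (B.pack [3])
  collected ref
```

Because the handler drains the socket, the read side would stop being permanently "ready", which is what would avoid the busy-wait described above.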


Cheers,


[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-12 Thread Bardur Arantsson


Jeremy Shaw wrote:


import Control.Concurrent
import Control.Concurrent.MVar
import System.Posix.Types

data RW = Read | Write

threadWaitReadWrite :: Fd -> IO RW
threadWaitReadWrite fd =
  do m <- newEmptyMVar
     rid <- forkIO $ threadWaitRead fd  >> putMVar m Read
     wid <- forkIO $ threadWaitWrite fd >> putMVar m Write
     r <- takeMVar m
     killThread rid
     killThread wid
     return r



Initial testing seems promising. I haven't been able to provoke the 
leak during 15-20 minutes of testing.


I'll test more thoroughly during the weekend.

Cheers,


[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-11 Thread Bardur Arantsson


Jeremy Shaw wrote:

On Wed, Feb 10, 2010 at 1:15 PM, Bardur Arantsson s...@scientician.net wrote:

I've also been contemplating some solutions, but I cannot see any solutions
to this problem which could reasonably be implemented outside of GHC itself.
GHC lacks a threadWaitError, so there's no way to detect the problem
except by timeout or polling. Solutions involving timeouts and polling are
bad in this case because they arbitrarily restrict the client connection
rate.

Cheers,



I believe solutions involving polling and timeouts may be the *only*
solution due to the way TCP works. There are two cases to consider here:



True, but my point was rather that a solution in the sendfile library 
would incur an _extra_ timeout on top of the timeout which is handled by 
the OS. It's very hard to come up with a proper timeout here because 
apps will have different requirements depending on the expected 
connection rate, etc. This is what I see as unacceptable since it would 
have to be a completely arbitrary timeout -- there's no way for the 
application to specify a timeout to the sendfile library since the API 
doesn't permit it.


[--snip--]

Case #1 - Proper Disconnect

I believe that in this case we are ok. select() may not wakeup due to the socket
being closed -- but something will eventually cause select() to wakeup, and
then next time through the loop, the call to select will fail with EBADF.
This will cause everyone to wakeup. We can test this case by writing a
client that purposely (and correctly) terminates the connection while
threadWaitWrite is blocking and see if that causes it to wakeup. To ensure
that the IOManager is eventually waking up, the server can have an IO thread
that just does, forever $ threadDelay (1*10^6)

Look here for more details:
http://darcs.haskell.org/packages/base/GHC/Conc.lhs



I don't have time to write a C test program right now. I'm actually not 
100% convinced that this case is *not* problematic, but my limited 
testing with well-behaved clients (wget, curl) hasn't turned up any 
problems so far.



Case #2 - Sudden Death

In this case, there is no way to tell if the client is still there without
trying to send / recv data. A TCP connection is not a 'tangible' link. It is
just an agreement to send packets to/from certain ports with certain
sequence numbers. It's much closer to snail mail than a telephone call.

If you set the keepalive socket option, then the TCP layer will
automatically ping the connection to make sure it is still alive. However, I
believe the default time between keepalive packets is 2 hours, and can only
be changed on a system wide basis?

http://www.unixguide.net/network/socketfaq/2.8.shtml


There are some options you can set via setsockopt(), see man 7 tcp:

   tcp_keepalive_intvl (default: 75s)
   tcp_fin_timeout (default: 60s)

(The latter is the amount of time to wait for the final FIN before 
forcing the socket to close.)


These can be set per-socket (via the TCP_KEEPINTVL and TCP_LINGER2 socket 
options respectively; the names above are the system-wide sysctls).



The other option is to try to send some data. There are at least two cases
that can happen here.


This is what I tried. The trouble here is that you have to force the 
thread doing threadWaitWrite to wake up periodically... and how do you 
decide how often? Too often and you're burning CPU doing nothing, too 
seldom and you're letting threads (and by implication 
used-but-really-disconnected-as-far-as-the-OS-is-concerned file 
descriptors) pile up. The overhead of memcpy (avoidance of which is 
sendfile's raison-d'être) is probably much less than the overhead of 
doing all this administration in userspace instead of just letting the 
kernel do its thing.


Even waking up very seldom (~1/s IIRC) incurred a lot of CPU overhead in 
my test case... but I suppose I could give it another try to see if I'd 
made some mistake in my code which caused it to use more CPU than necessary.




 1. the network cable is unplugged -- this is not an 'error'. The write
buffer will fill up and it will wait until it can send the data. If the
write buffer is full, it will either block or return EAGAIN depending on the
mode. Eventually, after 2 hours, it might give up.


I believe the socket is actually in non-blocking mode in my application. 
 I'm not putting it into non-blocking mode, so I'm guessing that the 
accept call is doing that -- or maybe it's just the default behavior 
of accept() on Linux. Converting a socket to a Handle (which is what the 
portable sendfile does) automatically puts it into blocking mode.


Actually, I think this whole issue could be avoided if the socket could 
just be forced into blocking mode. In that case, there would be no need 
to call threadWaitWrite: The native sendfile() call could never return 
EAGAIN (it would block instead), and so there'd be no need to call 
threadWaitWrite to avoid busy-waiting.



 2. the remote client has terminated the connection as far as it is
concerned but not notified the server -- when you try to send data it will
reject it, and send

[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-11 Thread Bardur Arantsson


Thomas DuBuisson wrote:

Bardur Arantsson s...@scientician.net wrote:

...
  then do errno <- getErrno
          if errno == eAGAIN
            then do
               threadDelay 100
               sendfile out_fd in_fd poff bytes
            else throwErrno "Network.Socket.SendFile.Linux"
  else return (fromIntegral sbytes)

That is, I removed the threadWaitWrite in favor of just adding a
threadDelay 100 when eAGAIN is encountered.

With this code, I cannot provoke the leak.

Unfortunately this isn't really a solution -- the CPU is pegged at
~50% when I do this and it's not exactly elegant to have a hardcoded
100 ms delay in there. :)


I don't think it matters wrt the desired final solution, but this is
NOT a 100 ms delay.  It is a 0.1 ms delay, which is less than a GHC
time slice and as such is basically a tight loop.  If you use a
reasonable value for the delay you will probably see the CPU being
almost completely idle.
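
Since `threadDelay` takes its argument in microseconds, a tiny helper makes the intended unit explicit. This is just an illustrative sketch; `milliseconds` and `delayMs` are names invented here, not part of base.

```haskell
import Control.Concurrent (threadDelay)

-- | Convert milliseconds to the microseconds that 'threadDelay' expects.
milliseconds :: Int -> Int
milliseconds ms = ms * 1000

-- | Pause the current thread for (at least) the given number of
-- milliseconds.
delayMs :: Int -> IO ()
delayMs = threadDelay . milliseconds
```

With this, the intended 100 ms pause would be written `delayMs 100`, which is `threadDelay 100000` rather than `threadDelay 100`.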



Excellent, thanks. I was probably too tired or annoyed when I wrote that 
code. I sorta-kinda-knew I must have been doing *something* wrong :).


I'll retry with a more reasonable delay tomorrow.

Cheers,


[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-11 Thread Bardur Arantsson


Jeremy Shaw wrote:


On Feb 11, 2010, at 1:57 PM, Bardur Arantsson wrote:



[--snip lots of technical info--]

Thanks for digging so much into this.

Just a couple of comments:



The whole point of the sendfile library is to use sendfile(), so not 
using sendfile() seems like the wrong solution.


Heh, well, presumably it could still use sendfile() only on platforms where 
it can actually guarantee correctness :).




There is some evidence that when you are doing select() on a readfds, 
and the connection is closed, select() will indicate that the fds is 
ready to be read, but when you read it, you get 0-bytes. That indicates 
that a disconnect has happened. However, if you are only doing 
read()/recv(), I expect that only happens in the event of a proper 
disconnect, because if you are just listening for packets, there is no 
way to tell the difference between the sender just not saying anything, 
and the sender dying:


True, but the point here is that the OS has a built-in timeout mechanism 
(via keepalives) and *can* tell the program when that timeout has elapsed.


That's the timeout we're trying to get at instead of having to 
implement a new one.


Good point about the readfds triggering when the client disconnects. 
I think that's what I've been seeing in all my other network-related 
code and I just misremembered the details. All my code is extremely 
likely to have been both reading and writing from (roughly) the same set 
of FDs at the same time.


If this method of detection is correct, then what we need is a 
threadWaitReadWrite, that will notify us if the socket can be read or 
written. The IO manager does not currently provide a function like 
that.. but we could fake it like this: (untested):


import Control.Concurrent
import Control.Concurrent.MVar
import System.Posix.Types

data RW = Read | Write

threadWaitReadWrite :: Fd -> IO RW
threadWaitReadWrite fd =
  do m <- newEmptyMVar
     rid <- forkIO $ threadWaitRead fd  >> putMVar m Read
     wid <- forkIO $ threadWaitWrite fd >> putMVar m Write
     r <- takeMVar m
     killThread rid
     killThread wid
     return r
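
One possible variant of the idea above, sketched under a few assumptions: the helper threads are cleaned up with `finally`, so they are also killed if the waiting thread itself is interrupted while blocked in `takeMVar`. The names `firstOf` and `demo` are invented for this example. The sketch does not handle the case where both actions throw (the caller would then block on the empty MVar), nor an asynchronous exception arriving between the two `forkIO` calls.

```haskell
import Control.Concurrent (forkIO, killThread, threadDelay,
                           threadWaitRead, threadWaitWrite)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (finally)
import System.Posix.Types (Fd)

-- | Run two actions concurrently and return whichever finishes first.
-- Both helper threads are killed afterwards, also when the caller is
-- interrupted by an exception while waiting for a result.
firstOf :: IO a -> IO b -> IO (Either a b)
firstOf left right = do
  m   <- newEmptyMVar
  lid <- forkIO (left  >>= putMVar m . Left)
  rid <- forkIO (right >>= putMVar m . Right)
  takeMVar m `finally` (killThread lid >> killThread rid)

data RW = Read | Write
  deriving (Eq, Show)

-- | Wait until the descriptor is readable or writable, whichever is first.
threadWaitReadWrite :: Fd -> IO RW
threadWaitReadWrite fd =
  either (const Read) (const Write)
    <$> firstOf (threadWaitRead fd) (threadWaitWrite fd)

-- | Socket-free demonstration: the shorter delay wins.
demo :: IO (Either String String)
demo = firstOf (threadDelay 1000    >> return "fast")
               (threadDelay 2000000 >> return "slow")
```

Separating the generic `firstOf` from the descriptor-specific wrapper also makes the racing logic testable without a real socket, as `demo` shows.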



I'll try to get the sendfile code to use this instead. AFAICT it 
shouldn't actually be necessary to peek on the read end of the socket 
to detect that something has gone wrong. We're guaranteed that 
sendfile() to a connection that's died (according to the OS, either due 
to proper disconnect or a timeout) will fail.


It might get a bit tricky to use this if the client is actually expecting 
to send proper data while the sendfile() is in progress -- if there's 
actual data to be read from the socket then the naive replacement of 
threadWaitWrite by threadWaitReadWrite will end up busy-waiting on EAGAIN 
since the socket will be readable every time threadWaitReadWrite gets called.

HOWEVER, that's not an issue in my particular scenario, so a simple 
replacement of threadWaitWrite by threadWaitReadWrite should do fine for 
testing purposes.


Of course, in the case where the client disconnects because someone 
turns off the power or pulls the ethernet cable, we have no way of 
knowing what is going on -- so there is still the possibility that dead 
connections will be left open for a long time.


True, but then it's (properly) left to the OS to decide and timeouts can 
be controlled via setsockopt -- as they should IMO.


I'll test tomorrow.

What I'll expect is that I'll still see a few dead threads lingering 
around for ~60 seconds (the OS-based timeout), but that I'll not see any
threads lingering indefinitely -- something which usually happens after 
a few hours of persistent use of my media server.



[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-10 Thread Bardur Arantsson


Jeremy Shaw wrote:

On Feb 9, 2010, at 6:47 PM, Thomas Hartman wrote:


Matt, have you seen this thread?

Jeremy, are you saying this a bug in the sendfile library on hackage,
or something underlying?


I'm saying that the behavior of the sendfile library is buggy. But it 
could be due to something underlying..


Either threadWaitWrite is buggy and should be fixed. Or threadWaitWrite 
is doing the right thing, and sendfile needs to be modified some how to 
account for the behavior. But I don't know which is the case or how to 
implement a solution to either option.


IMO, in the interests of correctness over speed, an interim release of 
sendfile which simply uses the portable code on Linux should be put 
out. The CPU overhead of the portable method doesn't matter that much 
for servers which aren't extremely busy.


I've also been contemplating some solutions, but I cannot see any 
solutions to this problem which could reasonably be implemented outside 
of GHC itself. GHC lacks a threadWaitError, so there's no way to 
detect the problem except by timeout or polling. Solutions involving 
timeouts and polling are bad in this case because they arbitrarily 
restrict the client connection rate.


Cheers,


[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-07 Thread Bardur Arantsson


Bardur Arantsson wrote:

Bardur Arantsson wrote:

(sorry about replying-to-self)

During yet another bout of debugging, I've added even more "I am here" 
instrumentation code to the SendFile code, and the culprit seems to be
threadWaitWrite.



As Jeremy Shaw pointed out off-list, the symptoms are also consistent
with a thread that simply gets stuck in threadWaitWrite.

I've tried a couple of different solutions to this based on starting a
separate thread to enforce a timeout on threadWaitWrite (using throwTo).

It seems to work to prevent the file descriptor leak, but causes GHC
to segfault after a while. Probably some sort of other resource exhaustion
since my code is just an evil hack:

 killer :: MVar () -> ThreadId -> IO ()
 killer dontKill otherThread = do
     threadDelay 5000
     x <- tryTakeMVar dontKill
     case x of
       Just _  -> putStrLn "Killer thread expired"
       Nothing -> throwTo otherThread (Overflow)

where the relevant bit of sendfile reads:

     mtid <- myThreadId
     dontKill <- newEmptyMVar
     forkIO $ killer dontKill mtid
     threadWaitWrite out_fd
     putMVar dontKill ()

So I'm basically creating a thread for every single threadWaitWrite operation
(which is a lot in this case).

Anyone got any ideas on a simpler way to handle this? Maybe I should just
report a bug for threadWaitWrite? IMO threadWaitWrite really should
throw some sort of IOException if the FD goes dead while it's waiting.
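
One possibly simpler shape for the timeout idea, as a minimal sketch rather than a tested fix: System.Timeout.timeout from base does the forking and throwTo bookkeeping itself. The names finishesWithin and waitWriteTimeout are made up for illustration, and whether this avoids the segfault seen with the hand-rolled killer thread is an open question.

```haskell
import Control.Concurrent
import Data.Maybe (isJust)
import System.Posix.Types (Fd)
import System.Timeout (timeout)

-- | Run an action, giving up after the given number of microseconds.
-- True means the action finished in time; False means it timed out.
-- (Hypothetical helper, not part of the sendfile package.)
finishesWithin :: Int -> IO () -> IO Bool
finishesWithin micros act = fmap isJust (timeout micros act)

-- | Wait for the descriptor to become writable, but only for a bounded time.
waitWriteTimeout :: Int -> Fd -> IO Bool
waitWriteTimeout micros fd = finishesWithin micros (threadWaitWrite fd)
```

A caller would treat a False result from waitWriteTimeout as "connection presumed dead" and clean up. This still needs an arbitrary timeout value, so it only tidies the workaround; it does not remove the underlying objection.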

I suppose an alternative way to work around this would be to force the output
socket into blocking (as opposed to non-blocking) mode, but I can't figure out
how to do this on GHC 6.10.x -- I only see setNonBlockingFD, which doesn't take
a parameter, unlike its 6.12.x counterpart.

Cheers,


[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-07 Thread Bardur Arantsson


Jeremy Shaw wrote:

It's not clear to me that this is actually a bug in threadWaitWrite.

I believe that under Linux, select() does not wake up just because the file
descriptor was closed.


select() has the option of specifying an exceptfds FD_SET where I'd 
certainly _expect_ select() to flag an FD if it's closed. Annoyingly, 
the man page is not very specific about what an exception is, so it's 
hard to be sure.



(Under Windows, and possibly solaris/BSD/etc it
does). So this behavior might be consistent with normal Linux behavior.
However, it is clearly annoying that (a) the expected behavior is not
documented (b) the behavior might be different under Linux than other OSes.

In some sense it is correct -- if the file descriptor is closed, then we
certainly can't write more to it -- so threadWaitWrite need not wake up..
But that leaves us with the issue of needing some way to be notified that
the file descriptor was closed so that we can clean up after ourselves..



True, it is perhaps technically not a bug, but it is certainly a 
misfeature since there is no easy way (at least AFAICT) to discover that 
something bad has happened for the file descriptor and act accordingly. 
AFAICT any solution would have to be based on a separate thread which 
either 1) checks the FD periodically somehow, or 2) simply lets the 
thread doing the threadWaitWrite time out after a set period of 
inactivity. Neither is very optimal.


Either way, I'd certainly expect the sendfile library to work around 
this somehow such that this situation doesn't occur. I'm just having a 
hard time thinking up a good solution :).


Cheers,


[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-06 Thread Bardur Arantsson


Brandon S. Allbery KF8NH wrote:

On Feb 5, 2010, at 02:56 , Bardur Arantsson wrote:

[--snip--]


Broken pipe is normally handled as a signal, and is only mapped to an 
error if SIGPIPE is set to SIG_IGN.  I can well imagine that the SIGPIPE 
signal handler isn't closing resources properly; a workaround would be 
to use the System.Posix.Signals API to ignore SIGPIPE, but I don't know 
if that would work as a general solution (it would depend on what other 
uses of pipes/sockets exist).


It was a good idea, but it doesn't seem to help to add

installHandler sigPIPE Ignore (Just fullSignalSet)

to the main function. (Given the package name I assume 
System.Posix.Signals works similarly to regular old signals, i.e. 
globally per-process.)


This is really starting to drive me round the bend...

One further thing I've noticed: When compiling on my 64-bit machine,
ghc issues the following warnings:

Linux.hsc:41: warning: format ‘%d’ expects type ‘int’, but argument 3 
has type ‘long unsigned int’
Linux.hsc:45: warning: format ‘%d’ expects type ‘int’, but argument 3 
has type ‘long unsigned int’
Linux.hsc:45: warning: format ‘%d’ expects type ‘int’, but argument 3 
has type ‘long unsigned int’
Linux.hsc:45: warning: format ‘%d’ expects type ‘int’, but argument 3 
has type ‘long unsigned int’


Those lines are:

39: -- max num of bytes in one send
40: maxBytes :: Int64
41: maxBytes = fromIntegral (maxBound :: (#type ssize_t))

and

44: foreign import ccall unsafe "sendfile64" c_sendfile
45:   :: Fd -> Fd -> Ptr (#type off_t) -> (#type size_t) -> IO (#type ssize_t)



This looks like a typical 32/64-bit problem, but normally I would expect 
any real run-time problems caused by a problematic conversion in the FFI 
to crash the whole process. Maybe I'm wrong about this...
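
For comparison, here is a sketch of the same bound written without hsc2hs. It assumes that CSsize from base is an acceptable stand-in for what (#type ssize_t) resolves to on this machine; clampCount is a hypothetical helper and not something the sendfile package defines.

```haskell
import Data.Int (Int64)
import System.Posix.Types (CSsize)

-- | Largest byte count a single call can report, assuming the C side
-- returns ssize_t. 'CSsize' is base's Haskell counterpart of ssize_t.
maxBytes :: Int64
maxBytes = fromIntegral (maxBound :: CSsize)

-- | Clamp a requested length to what one call can handle.
-- (Hypothetical helper, for illustration only.)
clampCount :: Int64 -> Int64
clampCount wanted = min wanted maxBytes
```

Since the conversion only widens or preserves the value, nothing here should truncate on either a 32-bit or a 64-bit build.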


Cheers,


[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-06 Thread Bardur Arantsson


Felipe Lessa wrote:

On Sat, Feb 06, 2010 at 09:16:35AM +0100, Bardur Arantsson wrote:

Brandon S. Allbery KF8NH wrote:

On Feb 5, 2010, at 02:56 , Bardur Arantsson wrote:

[--snip--]

Broken pipe is normally handled as a signal, and is only mapped
to an error if SIGPIPE is set to SIG_IGN.  I can well imagine that
the SIGPIPE signal handler isn't closing resources properly; a
workaround would be to use the System.Posix.Signals API to ignore
SIGPIPE, but I don't know if that would work as a general solution
(it would depend on what other uses of pipes/sockets exist).

It was a good idea, but it doesn't seem to help to add

installHandler sigPIPE Ignore (Just fullSignalSet)

to the main function. (Given the package name I assume
System.Posix.Signals works similarly to regular old signals, i.e.
globally per-process.)

This is really starting to drive me round the bend...


Have you seen GHC ticket #1619?

http://hackage.haskell.org/trac/ghc/ticket/1619




I hadn't. I guess the conclusion is that SIGPIPE is ignored by default anyway.
So much for that.

During yet another bout of debugging, I've added even more "I am here"
instrumentation code to the SendFile code, and the culprit seems to be
threadWaitWrite. Here's the bit of code I've modified:

 sendfile :: Fd -> Fd -> Ptr Int64 -> Int64 -> IO Int64
 sendfile out_fd in_fd poff bytes = do
     putStrLn "PRE-threadWaitWrite"
     threadWaitWrite out_fd
     putStrLn "AFTER threadWaitWrite"
     sbytes <- c_sendfile out_fd in_fd poff (fromIntegral bytes)
     putStrLn $ "AFTER c_sendfile; result was: " ++ (show sbytes)
     if sbytes <= -1
       then do errno <- getErrno
               if errno == eAGAIN
                 then sendfile out_fd in_fd poff bytes
                 else throwErrno "Network.Socket.SendFile.Linux"
       else return (fromIntegral sbytes)

This is the output when a file descriptor is lost:

---
AFTER sendfile: sbytes=27512
DIFFERENCE: 627264520
remaining=627264520, bytes=627264520
PRE-threadWaitWrite
Got request for CONTENT for objectId=1700,f2150400
Serving file 'X'...
Sending 625838080 bytes...
in_fd=13
---

So I have to conclude that threadWaitWrite is doing *something* which causes
the thread to die when the PS3 kills the connection.



[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-06 Thread Bardur Arantsson


Bardur Arantsson wrote:

(sorry about replying-to-self)

During yet another bout of debugging, I've added even more "I am here" 
instrumentation code to the SendFile code, and the culprit seems to be

 threadWaitWrite.

I think I've pretty much confirmed this.

I've changed the code again. This time to:

 sendfile :: Fd -> Fd -> Ptr Int64 -> Int64 -> IO Int64
 sendfile out_fd in_fd poff bytes = do
     putStrLn "PRE-threadWaitWrite"
     -- threadWaitWrite out_fd
     -- putStrLn "AFTER threadWaitWrite"
     sbytes <- c_sendfile out_fd in_fd poff (fromIntegral bytes)
     putStrLn $ "AFTER c_sendfile; result was: " ++ (show sbytes)
     if sbytes <= -1
       then do errno <- getErrno
               if errno == eAGAIN
                 then do
                   threadDelay 100
                   sendfile out_fd in_fd poff bytes
                 else throwErrno "Network.Socket.SendFile.Linux"
       else return (fromIntegral sbytes)

That is, I removed the threadWaitWrite in favor of just adding a
threadDelay 100 when eAGAIN is encountered.

With this code, I cannot provoke the leak.

Unfortunately this isn't really a solution -- the CPU is pegged at
~50% when I do this and it's not exactly elegant to have a hardcoded
delay in there (threadDelay takes microseconds, so 100 is only 100 µs). :)
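
If polling has to stay for now, one way to soften the CPU cost is to back off between retries instead of sleeping for a fixed interval. The following is only a sketch under stated assumptions: retryWithBackoff and demoAttempts are hypothetical names, and Nothing stands in for the eAGAIN case rather than calling c_sendfile.

```haskell
import Control.Concurrent (threadDelay)
import Data.IORef (modifyIORef, newIORef, readIORef)

-- | Retry an action until it yields a result. 'Nothing' stands in for the
-- EAGAIN case; the delay (microseconds) doubles after each miss, up to a cap.
retryWithBackoff :: Int -> Int -> IO (Maybe a) -> IO a
retryWithBackoff delay cap act = do
  r <- act
  case r of
    Just x  -> return x
    Nothing -> do
      threadDelay delay
      retryWithBackoff (min cap (delay * 2)) cap act

-- | Toy stand-in for the real call: "would block" twice, then succeeds.
-- Returns the number of attempts it took.
demoAttempts :: IO Int
demoAttempts = do
  calls <- newIORef (0 :: Int)
  let attempt = do
        modifyIORef calls (+ 1)
        n <- readIORef calls
        return (if n < 3 then Nothing else Just n)
  retryWithBackoff 100 10000 attempt
```

This is still polling, so it trades CPU load against latency rather than fixing the underlying problem with threadWaitWrite.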

I'm hoping that someone who understands the internals of GHC can chime
in here with some kind of explanation as to if/why/how threadWaitWrite can
fail in this way.

Anyone?

Cheers,


[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-05 Thread Bardur Arantsson


Jeremy Shaw wrote:

Actually,

We should start by testing if native sendfile leaks file descriptors even
when the whole file is sent. We have a test suite, but I am not sure if it
tests for file handle leaking...



I should have posted this earlier, but the exact message I'm seeing in 
the case where the Bad Client disconnects is this:


   hums: Network.Socket.SendFile.Linux: resource vanished (Broken pipe)

Oddly, I haven't been able to reproduce this using a wget client with a 
^C during transfer. When I disconnect wget with ^C or "pkill wget" or 
even "pkill -9 wget", I get this message:


  hums: Network.Socket.SendFile.Linux: resource vanished (Connection 
reset by peer)


(and no leak, as observed by lsof | grep hums).

So there appears to be some vital difference between the handling of the 
two cases.


Another observation which may be useful:

Before the sendfile' API change (Handle -> FilePath) in sendfile-0.6.x, 
my code used withFile to open the file and to ensure that it was 
closed. So it seems that withBinaryFile *should* also be fine. Unless 
the "Broken pipe" error somehow escapes the scope without causing a close.


I don't have time to dig more right now, but I'll try to see if I can 
find out more later.


Cheers,


[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-05 Thread Bardur Arantsson


Thomas Hartman wrote:

Do you have a test script to reproduce the behavior?



Unfortunately not, but the behavior *is* 100% reproducible with
my PS3 client. The production of a leaked FD appears to require a
particularly abrupt disconnect (see my other reply in this thread), so
you're probably safe in most cases.

Cheers,



[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

2010-02-05 Thread Bardur Arantsson

In desperation, I've tried to instrument a couple of the functions in 
SendFile:


 sendFile'' :: Socket -> Handle -> Integer -> Integer -> IO ()
 sendFile'' outs inp off count =
     do let out_fd = Fd (fdSocket outs)
        in_fd <- handleToFd inp
        putStrLn ("in_fd=" ++ show in_fd)
        finally (wrapSendFile' _sendFile out_fd in_fd off count)
                (do
                  putStrLn ("SENDFILE DONE " ++ show in_fd)
                )

 sendFile' :: Socket -> FilePath -> Integer -> Integer -> IO ()
 sendFile' outs infp offset count =
     bracket
        (openBinaryFile infp ReadMode)
        (\h -> do
          putStrLn "CLOSING FILE!"
          hClose h
          putStrLn "FILE CLOSED!")
        (\inp -> sendFile'' outs inp offset count)

(Yes, this made me feel dirty.)

Here's the resulting output from around when the file descriptor gets lost:

---
Serving file 'X'...
Sending 674465792 bytes... 

in_fd=7 

SENDFILE DONE 7 

CLOSING FILE! 

FILE CLOSED! 

hums: Network.Socket.SendFile.Linux: resource vanished (Broken pipe) 

Got request for CONTENT for objectId=1700,f2150400 


Serving file 'X'...
Sending 672892928 bytes... 

in_fd=7 

SENDFILE DONE 7 

CLOSING FILE! 

FILE CLOSED! 

hums: Network.Socket.SendFile.Linux: resource vanished (Broken pipe) 

Got request for CONTENT for objectId=1700,f2150400 


Serving file 'X'...
Sending 670140416 bytes... 

in_fd=7 



*- What happened here?

Got request for CONTENT for objectId=1700,f2150400 


Serving file 'X'...
Sending 667256832 bytes... 

in_fd=9 

SENDFILE DONE 9 

CLOSING FILE! 

FILE CLOSED! 

hums: Network.Socket.SendFile.Linux: resource vanished (Broken pipe) 

Got request for CONTENT for objectId=1700,f2150400 


Serving file 'X'...
Sending 665028608 bytes... 

in_fd=9 

SENDFILE DONE 9 

CLOSING FILE! 

FILE CLOSED! 

hums: Network.Socket.SendFile.Linux: resource vanished (Broken pipe) 

Got request for CONTENT for objectId=1700,f2150400 


Serving file 'X'...
---


Anyone got any clues as to what might cause the behavior shown at the mark?

The only idea I have is that *something* in the SendFile library kills 
the thread completely (or somehow evades finally), but I have no idea 
what it might be.
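
One possibility consistent with the log is that the thread is neither killed nor escaping finally, but simply blocked forever, in which case the cleanup never gets a chance to run. The sketch below illustrates that general behaviour with an MVar standing in for whatever the thread might be waiting on; blockedFinallyDemo is a made-up name and nothing here touches the sendfile code.

```haskell
import Control.Concurrent
import Control.Exception (finally)

-- | Returns (cleanup ran while the thread was still blocked,
--            cleanup ran after the thread was killed).
blockedFinallyDemo :: IO (Bool, Bool)
blockedFinallyDemo = do
  gate    <- newEmptyMVar :: IO (MVar ())  -- never filled while the worker waits
  cleaned <- newEmptyMVar :: IO (MVar ())  -- filled by the cleanup handler
  tid <- forkIO (takeMVar gate `finally` putMVar cleaned ())
  threadDelay 100000                       -- give the worker time to block
  ranEarly <- fmap not (isEmptyMVar cleaned)
  killThread tid                           -- asynchronous exception unblocks it
  takeMVar cleaned                         -- returns once the handler has run
  putMVar gate ()                          -- keeps 'gate' reachable until here
  return (ranEarly, True)
```

The cleanup only runs once something interrupts the blocked thread, which would match a request that prints in_fd=7 and then never prints SENDFILE DONE.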


Cheers,


[Haskell-cafe] sendfile leaking descriptors on Linux?

2010-02-04 Thread Bardur Arantsson


Hi all,

I've been using the sendfile package off Hackage, but it seems to be 
leaking file descriptors when using the Linux-native build.


What's happening in my specific case is the following:

   1) client requests a range of a file
   2) server starts sending the range
   3) client disconnects before receiving the whole file

This happens over and over with the client requesting different ranges 
of the file (so the client does make progress).


If I use the portable build of the sendfile package, everything works 
fine for hours and hours of this happening.


If I use the Linux-native build of the sendfile package, the server
will eventually run out of file descriptors. According to lsof the 
files that are being kept open are the data files being sent in 2) above.


This is on GHC 6.10.x (Ubuntu).

Is anyone else seeing this? Anyone got any idea what's going on?

Cheers,


[Haskell-cafe] Re: PROPOSAL: Web application interface

2010-01-24 Thread Bardur Arantsson


Michael Snoyman wrote:

[--snip--]

Next, I have made the ResponseBodyClass typeclass specifically with the goal
of allowing optimizations for lazy bytestrings and sending files. The former
seems far-fetched; the latter provides the ability to use a sendfile system
call instead of copying the file data into memory. However, in the presence
of gzip encoding, how useful is this optimization?

[--snip--]

I'm hoping that the "Web" bit in your project title doesn't literally 
mean that WAI is meant to be restricted to solely serving content to 
browsers. With that caveat in mind:


For non-WWW HTTP servers it can be extremely useful to have sendfile. An 
example is my Haskell UPnP Media Server (hums) application. It's sending 
huge files (AVIs, MP4s, etc.) over the network and since these files are 
already compressed as much as they're ever going to be, gzip would be 
useless. The CPU load of my hums server went from 2-5% to 0% when 
streaming files just from switching from a Haskell I/O based solution to 
proper sendfile.


Lack of proper support for sendfile() was indeed one of the reasons that 
I chose to roll my own HTTP server for hums. I should note that this was 
quite a while ago and I haven't really gone back to reevaluate that 
choice -- there's too many HTTP stacks to choose from right now and I 
don't have the time to properly evaluate them all.


For this type of server, response *streaming* is also extremely 
important for those cases where you cannot use sendfile, so I'd hate to 
see a standard WAI interface preclude that. (No, lazy I/O is NOT an 
option -- the HTTP clients in a typical UPnP media client behave so 
badly that you'll run out of file descriptors in no time. Trust me, I've 
tried.)
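
To make the streaming point concrete, here is a minimal sketch of strict, chunked file serving without lazy I/O. It assumes a caller-supplied sink (for example, something that writes to the client socket); streamFile is a hypothetical name and not part of WAI or of hums.

```haskell
import qualified Data.ByteString as B
import System.IO

-- | Read the file in strict chunks of at most 'chunkSize' bytes, pass each
-- chunk to 'sink', and return the number of bytes handed over. The handle is
-- closed by 'withBinaryFile' whether 'sink' returns or throws.
streamFile :: Int -> FilePath -> (B.ByteString -> IO ()) -> IO Int
streamFile chunkSize path sink =
  withBinaryFile path ReadMode (go 0)
  where
    go total h = total `seq` do
      chunk <- B.hGet h chunkSize
      if B.null chunk
        then return total
        else do
          sink chunk
          go (total + B.length chunk) h
```

Memory use stays bounded by the chunk size, and the file descriptor's lifetime is tied to the scope of the call rather than to when a lazy stream happens to be fully consumed.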


Cheers,


[Haskell-cafe] Is it just me... or is cabal/hackage a little broken?

2009-12-22 Thread Bardur Arantsson


Hi all,

Sorry about the inflammatory title, but I just got this message from an 
uploaded package (hums):


  Warning: This package indirectly depends on multiple versions of the
  same package. This is highly likely to cause a compile failure.

The thing is, I got the same message while trying to compile locally and 
it turned out that all I had to do was to


   $ cabal install PKG-X

on all the packages that cabal complained about. So why doesn't hackage 
do this automagically when I upload a package? How am I supposed to know 
which versions of my package's dependencies (or their dependencies) are 
the most recently compiled by hackage?


For the record: I did do a "Check package" upload first. It didn't complain.

Is this an intractable problem? Am I being overly demanding (probably)?

Cheers,

Bárður





[Haskell-cafe] Re: Haskell not ready for Foo [was: Re: Hypothetical Haskell job in New York]

2009-01-08 Thread Bardur Arantsson


Manlio Perillo wrote:

John Goerzen wrote:

On Thu, Jan 08, 2009 at 10:36:32AM -0700, John A. De Goes wrote:
[...]

On the other hand, I see nothing in Haskell that would prevent its use
for any of your purposes.  There are numerous high-level web
infrastructures already.  Perhaps they are more or less suited to your
needs, but that's a library issue, not a language issue.  



The question is not about the Haskell language.
I think that Haskell is far better than Erlang, and in fact I'm studying 
Haskell and not Erlang; one of the reasons I chose Haskell is 
its support for concurrency.


The problem, IMHO, is the availability of solid, production-ready 
servers implemented in Haskell that can be used as case studies.


The major web framework in Haskell is HAppS, if I'm correct, and yet in 
the HAppS code I see some things that make me worry about the robustness 
of the code.



[--snip--]

Indeed. I've been looking for a Haskell HTTP server implementation that 
can actually handle file serving using strictly limited memory (for a 
simple UPnP server, as of yet unreleased) and that also doesn't leak 
handles like a sieve, but I haven't found anything yet. I don't know, 
maybe my hackage-foo is lacking. In the end I just rolled my own 
implementation using the HTTP package for parsing requests and doing all 
the socket I/O myself using low-level primitives. It seemed to be the 
only way to guarantee reasonable resource usage while serving 
multi-gigabyte files to fickle HTTP clients that like to drop 
connections willy-nilly.


Don't get me wrong -- the socket support is pretty decent, but there are 
also some weird idiosyncrasies, for example requiring that the PortNum 
is specified in network byte order and lacking a function to convert 
host-network byte order (hton).


Oleg's Iteratee does look very interesting though. Maybe I'll have a go 
at trying to use his ideas in my UPnP server.


Cheers,

Bardur Arantsson


[Haskell-cafe] Re: Haskell not ready for Foo [was: Re: Hypothetical Haskell job in New York]

2009-01-08 Thread Bardur Arantsson


John Goerzen wrote:

On Thu, Jan 08, 2009 at 10:37:55PM +0100, Bardur Arantsson wrote:
Don't get me wrong -- the socket support is pretty decent, but there are  
also some weird idiosyncrasies, for example requiring that the PortNum  
is specified in network byte order and lacking a function to convert  
host-network byte order (hton).


Look at Haddock for PortNumber:

newtype PortNumber
  Constructors
PortNum Word16  


  Instances

  Enum PortNumber
  Eq PortNumber
  Integral PortNumber
  Num PortNumber
  Ord PortNumber
  Real PortNumber
  Show PortNumber
  Typeable PortNumber
  Storable PortNumber

Try it in ghci:

Prelude Network.Socket> 15 :: PortNumber
15
Prelude Network.Socket> PortNum 15
3840
Prelude Network.Socket> (fromIntegral (15::Int))::PortNumber
15

So, in essence, there are *many* functions that let you do this.  You
should not be needing to construct PortNum by hand.


Thanks. For some reason I hadn't thought to use

   (fromIntegral x)::PortNumber

I guess I got stuck on the idea of constructing a PortNum directly and 
didn't think beyond that. (Maybe PortNum should really be an abstract 
type to force indirect construction...?)


I guess the API isn't all that idiosyncratic after all :).

Cheers,

Bardur Arantsson


[Haskell-cafe] Re: Hypothetical Haskell job in New York

2009-01-08 Thread Bardur Arantsson


John Goerzen wrote:

On Thu, Jan 08, 2009 at 09:46:36PM +0100, Manlio Perillo wrote:

I'm speaking about servers, not clients.




Personally, I only know http://hpaste.org/, based on
Server: HAppS/0.8.4


Take a look at Hackage.  There are quite a few other Haskell web
frameworks as well: everything from the low-level FastCGI to
higher-level HSP and WASH.



FastCGI is not an HTTP server. WASH seems to include one, but the latest 
version (Wash and go) seems to be from mid-2007 ("tested with GHC 6.6" 
as the web page states), unless of course I'm looking at the wrong page. 
That doesn't exactly inspire a lot of confidence.


Now, if you're talking about using, say, Apache + FastCGI then you'll 
probably have something pretty robust, but I don't think that counts as 
a Haskell server.


Generally my experience has been that most of the Haskell server stuff 
hasn't been very mature.


And about HAppS, I'm not a Haskell expert, but reading the source I see  
that static files are served (in the HTTP server) using  
Data.ByteString.Lazy's hGetContents


Is this ok?


In what respect?  The fact that something uses
ByteString.Lazy.hGetContents doesn't imply a problem to me.  It's a
useful function.  It can be used properly, or not, just as while or
read() in C can be.


It's a great way to introduce unavoidable handle leaks, that's for sure.

Cheers,

Bardur Arantsson

