Re: [Haskell-cafe] Difference between Lazy ByteStrings and Strings in alex

2007-02-14 Thread Björn Bringert
On a related topic, I think Duncan Coutts and Lennart Kolmodin have 
worked on adding ByteString support to Alex. It seems to be available in 
the current darcs version of Alex. You many want to check with them for 
more details.


/Björn

Jefferson Heard wrote:
It was suggested that I might derive some performance benefit from using lazy 
bytestrings in my tokenizer instead of regular strings.  Here's the code that 
I've tried.  Note that I've hacked the basic wrapper code in the Lazy 
version, so the code should be all but the same.  The only thing I had to do 
out of the ordinary was write my own 'take' function instead of using the 
substring function provided by Data.Lazy.ByteString.Char8.  The take function 
I used was derived from the one GHC uses in GHC.List and produces about the 
same code.  

The non-lazy version runs in 38 seconds on a 211MB file versus the lazy 
versions 41 seconds.  That of course doesn't seem like that much, and in the 
non-lazy case, I have to break the input up into multiple files, whereas I 
don't have to in the lazy version -- this does not take any extra time.  The 
seconds do add up to a couple of hours for me, though once I'm done, and so 
I'd like to understand why, when the consensus was that Data.ByteString.Lazy 
might give me better performance in the end, it doesn't do so here.  

I am running GHC 2.6 now, and am using -O3 as my optimization parameter.  I'm 
profiling the code now, but was wondering if there was any insight...


-- Jeff 


Non-lazy version

{
module Main
where

import qualified FileReader

}

%wrapper basic

$letter = [a-zA-Z]
$digit = 0-9
$alphanum = [a-zA-Z0-9]
$punct = [\! \@ \# \$ \% \^ \ \* \( \) \_ \- \+ \= \{ \[ \} \] \\ \| \; \: \' 
\ \, \. \? \/ \` \~]

$dec = \.
$posneg = [\- \+]

@date1 = jan($punct|uary)?\ $digit{1,2}(\,\ $digit{2,4})?
   | feb($punct|ruary)?\ $digit{1,2}(\,\ $digit{2,4})?
   | mar($punct|ch)?\ $digit{1,2}(\,\ $digit{2,4})?
   | apr($punct|il)?\ $digit{1,2}(\,\ $digit{2,4})?
   | may?\ $digit{1,2}(\,\ $digit{2,4})?
   | jun($punct|e)?\ $digit{1,2}(\,\ $digit{2,4})?
   | jul($punct|y)?\ $digit{1,2}(\,\ $digit{2,4})?
   | aug($punct|ust)?\ $digit{1,2}(\,\ $digit{2,4})?
   | sep($punct|tember)?\ $digit{1,2}(\,\ $digit{2,4})?
   | sept($punct)?\ $digit{1,2}(\,\ $digit{2,4})?
   | oct($punct|ober)?\ $digit{1,2}(\,\ $digit{2,4})?
   | nov($punct|ember)?\ $digit{1,2}(\,\ $digit{2,4})?
   | dec($punct|ember)?\ $digit{1,2}(\,\ $digit{2,4})?

@date2 = $digit{1,2} $punct $digit{1,2} $punct $digit{2,4}

@time = $digit{1,2} \: $digit{2} (am|pm)?

@word = $alphanum+

@number = $posneg? $digit+ 
| $posneg? $digit+ $dec $digit+

| $posneg? $digit+ (\,$digit{3})+
| $posneg? $digit? (\,$digit{3})+ $dec $digit+

$white = [\t\r\n\v\f\ ]

@doc = \ DOC \
@tag = \ $alphanum+ \
 | \\/ $alphanum+ \

tokens :- 
  @doc{ \s -  }

  @tag;
  $white+ ; 
  @time   { \s - s }
  @number { \s - s } 
  @word   { \s - s }
  $punct  ; 
  .   ;


{

printCount c [] = print c
printCount c (l:ls) = if l ==  then printCount (c+1) ls else printCount c ls

main = do
file - readFile trecfile1 
printCount 0 (alexScanTokens file) 
 
}


-- 

Version depending on ByteString.Lazy -- note that the grammar is the same, so 
it has been omitted

-- 


... grammar ...

{
type AlexInput = (Char, -- previous char
  B.ByteString)   -- current input string

takebytes :: Int - B.ByteString - String
takebytes (0) _ =  
takebytes n s = c : takebytes (n-1) cs
where c = B.index s 0
  cs = B.drop 1 s

alexGetChar :: AlexInput - Maybe (Char,AlexInput)
alexGetChar (_, bytestring) 
| bytestring == B.empty = Nothing

| otherwise = Just (c , (c,cs))
where c = B.index bytestring 0
  cs = B.drop 1 bytestring

alexInputPrevChar :: AlexInput - Char
alexInputPrevChar (c,_) = c

alexScanTokens :: B.ByteString - [String]
alexScanTokens str = go ('\n',str)
  where go inp@(_,str) =
  case alexScan inp 0 of
AlexToken inp' len act - act (takebytes len str) : go inp'
AlexSkip  inp' len - go inp'
AlexEOF - []

AlexError _ - error lexical error





printCount :: Int - [String] - IO ()
printCount c [] = print c
printCount c (l:ls) = if l ==  then printCount (c+1) ls else printCount c ls

main = do
file - B.readFile trecfile1 
printCount 0 (alexScanTokens file) 
 
}

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list

Re: [Haskell-cafe] Difference between Lazy ByteStrings and Strings in alex

2007-02-14 Thread Duncan Coutts
On Tue, 2007-02-13 at 22:43 -0500, Jefferson Heard wrote:
 It was suggested that I might derive some performance benefit from using lazy 
 bytestrings in my tokenizer instead of regular strings.  Here's the code that 
 I've tried.  Note that I've hacked the basic wrapper code in the Lazy 
 version, so the code should be all but the same.  The only thing I had to do 
 out of the ordinary was write my own 'take' function instead of using the 
 substring function provided by Data.Lazy.ByteString.Char8.  The take function 
 I used was derived from the one GHC uses in GHC.List and produces about the 
 same code.

If you use the latest darcs version of alex and the basic-bytestring
wrapper then you don't have to write any of your own take functions.

Actually, lazy ByteString are still not as optimised as I would like.
There are still too many indirections. That's something I'm working on
at the moment.

Duncan

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Network.CGI.Compat.pwrapper

2007-02-14 Thread Björn Bringert

Gracjan Polak wrote:

Bjorn Bringert bringert at cs.chalmers.se writes:

Is there a description what is a *CGI* protocol?

Here you go: http://hoohoo.ncsa.uiuc.edu/cgi/interface.html



I should be more clear: what kind of data does pwrapper expect? Somewhere in the
middle it needs two handles: one to write and one to read which seem to be
equivalent to stdin/stdout. But what about environment? How is it transfered, as
someone ale pointed out pwrapper runs on different machine?


I think your best bet is to read the code, it's not many lines. IIRC, 
pwrapper takes the environment variables from the local environment, 
which is rather useless. If you want a protocol for talking CGI over a 
socket, FastCGI does just that.


/Björn
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re[2]: [Haskell-cafe] Summer of Code

2007-02-14 Thread Bulat Ziganshin
Hello Donald,

Wednesday, February 14, 2007, 1:51:26 AM, you wrote:
 7 were successful, 2 were unsuccessful.

can you please name successful projects and their download pages?


-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to solve this problem?It's quite easy in PHP.

2007-02-14 Thread Neil Mitchell

Hi

Eeek, a solution that does monadic maps and require's rank 2 types!

arr = [('a',1), ('b',2), ('c',3)]
showAll = lines (map showItem arr)
showItem (a,n) = a :  =  ++ show n
main = putStr showAll

I've broken this up a bit more than usual - most people would probably
just put showAll inside main, but this separates out the concepts. arr
is the data, which you could construct with a zip if that really is
all there is to it. showItem shows a single item, showAll shows them
all, and main just prints out the information.

And this solution is Haskell 98, anything that uses printf is Haskell' only.

Thanks

Neil
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to solve this problem?It's quite easy in PHP.

2007-02-14 Thread Donald Bruce Stewart
ndmitchell:
 Hi
 
 Eeek, a solution that does monadic maps and require's rank 2 types!
 
 arr = [('a',1), ('b',2), ('c',3)]
 showAll = lines (map showItem arr)
 showItem (a,n) = a :  =  ++ show n
 main = putStr showAll
 
 I've broken this up a bit more than usual - most people would probably
 just put showAll inside main, but this separates out the concepts. arr
 is the data, which you could construct with a zip if that really is
 all there is to it. showItem shows a single item, showAll shows them
 all, and main just prints out the information.
 
 And this solution is Haskell 98, anything that uses printf is Haskell' only.
 
 Thanks
 
 Neil

$ hugs +98
__   __ __  __     ___  _
||   || ||  || ||  || ||__  Hugs 98: Based on the Haskell 98 standard
||___|| ||__|| ||__||  __|| Copyright (c) 1994-2005
||---|| ___||   World Wide Web: http://haskell.org/hugs
||   || Report bugs to: hugs-bugs@haskell.org
||   || Version: March 2005 _

Haskell 98 mode: Restart with command line option -98 to enable extensions

Type :? for help
Hugs.Base :l Text.Printf
Text.Printf printf %d (1::Int) :: String
1



Fix YHC!

:-)

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to solve this problem?It's quite easy in PHP.

2007-02-14 Thread Neil Mitchell

Hi Don,


Type :? for help
Hugs.Base :l Text.Printf
Text.Printf printf %d (1::Int) :: String
1


My bad - sorry, too many presentations/papers were people encoded
printf using multi-ranked-generalised-associated types :)


Fix Yhc!


Fair point, should be as simple as compiling that file...

Thanks

Neil
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to solve this problem?It's quite easy in PHP.

2007-02-14 Thread Lennart Augustsson

Using printf doesn't need Haskell'.  I wrote to require only Haskell98.

-- Lennart


On Feb 14, 2007, at 11:36 , Neil Mitchell wrote:


Hi

Eeek, a solution that does monadic maps and require's rank 2 types!

arr = [('a',1), ('b',2), ('c',3)]
showAll = lines (map showItem arr)
showItem (a,n) = a :  =  ++ show n
main = putStr showAll

I've broken this up a bit more than usual - most people would probably
just put showAll inside main, but this separates out the concepts. arr
is the data, which you could construct with a zip if that really is
all there is to it. showItem shows a single item, showAll shows them
all, and main just prints out the information.

And this solution is Haskell 98, anything that uses printf is  
Haskell' only.


Thanks

Neil
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A new code search engine

2007-02-14 Thread Conrad Parker

On 14/02/07, Stephane Bortzmeyer [EMAIL PROTECTED] wrote:

http://www.krugle.com/


Nice :-)


Unlike Google, you can specify Haskell as a language.


Google CodeSearch is pretty handy though:

http://www.google.com/codesearch?q=lang%3Ahaskell

it seems to return code with good relevence, and can look for specific
licenses, but doesn't search on the structure of the code.

Sometimes I find these tools useful for finding examples of how people
actually use particular functions in the wild :-)

cheers,

Conrad.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Calendar Dates before the epoch

2007-02-14 Thread jim burton



Bjorn Bringert-2 wrote:
 
 
 Use the time package (Data.Time.*). time-1.0 is in GHC 6.6 extralibs, 
 and available from Hackage 
 (http://hackage.haskell.org/cgi-bin/hackage-scripts/package/time-1.0) 
 and the development version lives at
 http://darcs.haskell.org/packages/time/
 
 
Thanks Björn

Jim

-- 
View this message in context: 
http://www.nabble.com/Calendar-Dates-before-the-epoch-tf3221685.html#a8968473
Sent from the Haskell - Haskell-Cafe mailing list archive at Nabble.com.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] LugRadio talk

2007-02-14 Thread Dougal Stanton
So which one of you wonderful people volunteered to give a talk on 
Haskell at the coming LugRadio Live? I'm afraid I didn't catch the name, 
though it might have been a Magnus something?


It might be enough to make me go if I knew there would be some funky 
Haskelling going on.


Cheers,

D.

--
Dougal Stanton
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A new code search engine

2007-02-14 Thread Adam Peacock

On 2/14/07, Stephane Bortzmeyer [EMAIL PROTECTED] wrote:

http://www.krugle.com/

Unlike Google, you can specify Haskell as a language.


It is true that you can't directly specify the programming language
with Google. But you can specify the filetype, i.e. hs or lhs, with
Google.

To do this, just add `filetype:hs` to you search.

And according to my initial tests, Google still wins.

Adam
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] LugRadio talk

2007-02-14 Thread Lennart Kolmodin

Dougal Stanton wrote:
So which one of you wonderful people volunteered to give a talk on 
Haskell at the coming LugRadio Live? I'm afraid I didn't catch the name, 
though it might have been a Magnus something?


It might be enough to make me go if I knew there would be some funky 
Haskelling going on.




I heard it as Magnus too, a Swedish name.

Sounds interesting indeed, LugRadio Live 2007, 7th-8th July 2007, The 
Lighthouse, Fryer Street, Wolverhampton, UK.


Cheers,
  Lennart Kolmodin

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Where can I find XmlRpc package?

2007-02-14 Thread keepbal

http://www.cs.chalmers.se/~bringert/darcs/blob/Makefile

BlobXmlRpc: GHCFLAGS += -package XmlRpc

I can't find XmlRpc,so I use haxr instead,but it doesn't work.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Where can I find XmlRpc package?

2007-02-14 Thread Bjorn Bringert

On Feb 14, 2007, at 22:50 , keepbal wrote:


http://www.cs.chalmers.se/~bringert/darcs/blob/Makefile

BlobXmlRpc: GHCFLAGS += -package XmlRpc

I can't find XmlRpc,so I use haxr instead,but it doesn't work.


haxr is the new name for the XmlRpc package, so changing -package  
XmlRpc to -package haxr should work. If it doesn't, please include  
the information needed to figure out what's happening, for example  
what you tried and what happened (error messages etc.).


Blob is quite old by the way. I can hardly remember what it does any  
more.


/Björn___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Where can I find XmlRpc package?

2007-02-14 Thread keepbal

BlobXmlRpc.hs:36:19: parse error on input `$'
make: *** [BlobXmlRpc] Error 1


2007/2/15, Bjorn Bringert [EMAIL PROTECTED]:


On Feb 14, 2007, at 22:50 , keepbal wrote:

 http://www.cs.chalmers.se/~bringert/darcs/blob/Makefile

 BlobXmlRpc: GHCFLAGS += -package XmlRpc

 I can't find XmlRpc,so I use haxr instead,but it doesn't work.

haxr is the new name for the XmlRpc package, so changing -package
XmlRpc to -package haxr should work. If it doesn't, please include
the information needed to figure out what's happening, for example
what you tried and what happened (error messages etc.).

Blob is quite old by the way. I can hardly remember what it does any
more.

/Björn
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Programming an Evaluation Context

2007-02-14 Thread Klaus Ostermann

Hi there,

in structural operational semantics, an evaluation context is often used to 
decompose an expression into a redex and its context. In a formal semantics on 
paper, an expression can just be pattern matched over the grammar of an 
evaluation context. If one wants to implement such a semantics in the form of an 
interpreter, I could not come up with a similarly nice solution. I have declared 
two separate data types (one for expressions, and one for evaluation contexts) 
and explicit functions to convert an expression into a (evaluation context, 
redex) pair.


For example, I could have

data Expr = Val Int | Plus Expr Expr

and

data Ctx = Hole | CPlusl Ctx Expr | CPlusr Int Expr

Are there any tricks to mimick more closely what is going on in the formal 
semantics?


Klaus
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Data.ByteString.Lazy.Char8 and finding substrings

2007-02-14 Thread Donald Bruce Stewart
magnus:
 I'm curious, why doesn't Data.ByteString.Lazy.Char8 have the functions
 for searching for substrings that Data.ByteString.Char8 has (isPrefixOf,
 isSuffixOf, isSubstringOf, findSubstring and findSubstrings)?
 

Sorry for the delay.

The reason they're missing is that no one implemented it.
Would be nice to have though.

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Programming an Evaluation Context

2007-02-14 Thread Jim Apple

On 2/14/07, Klaus Ostermann [EMAIL PROTECTED] wrote:

in structural operational semantics, an evaluation context is often used to
decompose an expression into a redex and its context.


Have you seen

http://citeseer.ist.psu.edu/mcbride01derivative.html

The Derivative of a Regular Type is its Type of One-Hole Contexts

Jim
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe