[Haskell-cafe] CFP: APLAS 2011, Kenting, Taiwan

2011-05-06 Thread Shin-Cheng Mu
) 
Radha Jagadeesan (Depaul U., USA) 
Naoki Kobayashi (Tohoku U., Japan) 
Daniel Kroning (U. Oxford, UK) 
Rupak Majumdar (MPI-SWS, Germany) 
Andrzej Murawski (U. Leicester, UK) 
Paulo Oliva (Queen Mary U. London, UK) 
Doron Peled (Bar Ilan U., Israel) 
Sukyoung Ryu (KAIST, Korea) 
Sriram Sankaranarayanan (U. Colorado, USA) 
Armando Solar-Lezama (MIT, USA) 
Sam Staton (U. Cambridge, UK)
Viktor Vafeiadis (MPI-SWS, Germany)
Kapil Vaswani (MSR, India) 
Martin Vechev (IBM, USA)
Peng Wu (IBM, USA) 
Hongseok Yang (Queen Mary U. London, UK) 
Pen-Chung Yew (Academia Sinica, Taiwan) 

ORGANIZERS:
Mike Dodds, U. Cambridge (Poster chair)
Shin-Cheng Mu, Academia Sinica, Taiwan (Local arrangement chair)
Noam Rinetzky, Queen Mary U. London (Publicity chair)
Yih-Kuen Tsay, National Taiwan University (Finance chair)


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] 2nd CFP: ACM SIGPLAN 7th Workshop on Generic Programming (WGP 2011)

2011-05-05 Thread Shin-Cheng Mu
==
 CALL FOR PAPERS

WGP 2011

   7th ACM SIGPLAN Workshop on Generic Programming
  Tokyo, Japan
   Sunday, September 18th, 2011

 http://flolac.iis.sinica.edu.tw/wgp11/

Collocated with the International Conference on Functional Programming
(ICFP 2011)
==

Goals of the workshop
-

Generic programming is about making programs more adaptable by making
them more general. Generic programs often embody non-traditional kinds
of polymorphism; ordinary programs are obtained from them by suitably
instantiating their parameters. In contrast with normal programs, the
parameters of a generic program are often quite rich in structure; for
example they may be other programs, types or type constructors, class
hierarchies, or even programming paradigms.

Generic programming techniques have always been of interest, both to
practitioners and to theoreticians, and, for at least 20 years,
generic programming techniques have been a specific focus of research
in the functional and object-oriented programming communities. Generic
programming has gradually spread to more and more mainstream
languages, and today is widely used in industry. This workshop brings
together leading researchers and practitioners in generic programming
from around the world, and features papers capturing the state of the
art in this important area.

We welcome contributions on all aspects, theoretical as well as
practical, of

   * generic programming,
   * programming with (C++) concepts,
   * meta-programming,
   * programming with type classes,
   * programming with modules,
   * programming with dependent types,
   * polytypic programming,
   * adaptive object-oriented programming,
   * component-based programming,
   * strategic programming,
   * aspect-oriented programming,
   * family polymorphism,
   * object-oriented generic programming,
   * and so on.

Organizers
--

Co-Chair
 Jaakko Järvi, Texas AM University, USA
Co-Chair
 Shin-Cheng Mu, Academia Sinica, Taiwan

Programme Committee
---

Dave Abrahams, BoostPro Computing, USA
Magne Haveraaen, Universitetet i Bergen, Norway
Akimasa Morihata, Tohoku University, Japan
Pablo Nogueira, Universidad Politécnica de Madrid, Spain
Ulf Norell, Chalmers University of Technology and University of Gothenberg, 
Sweden
Ross Paterson, City University London, UK
Rinus Plasmeijer, Radboud University Nijmegen, The Netherlands
Sibylle Schupp, Technische Universität Hamburg-Harburg, Germany
Andrew Sutton, Kent State University, USA
Tarmo Uustalu, Institute of Cybernetics, Estonia

Important Information
-

We plan to have formal proceedings, published by the ACM.

Submission details
Deadline for submission: Monday2011-06-06
Notification of acceptance:  Tuesday   2011-07-01
Final submission due:Monday2011-07-25
Workshop:Sunday2011-09-18

Authors should submit papers, in postscript or PDF format, formatted
for A4 paper, to the WGP11 EasyChair instance by the above deadline.
The length should be restricted to 12 pages in standard
(two-column, 9pt) ACM format. Accepted papers are published by the ACM
and will additionally appear in the ACM digital library.

History of the Workshop on Generic Programming
--

This year:

 * Tokyo, Japan 2011 (affiliated with ICFP11)

Earlier Workshops on Generic Programming have been held in

 * Baltimore, Maryland, US 2010 (affiliated with ICFP10)
 * Edinburgh, UK 2009 (affiliated with ICFP09)
 * Victoria, BC, Canada 2008 (affiliated with ICFP),
 * Portland 2006 (affiliated with ICFP),
 * Ponte de Lima 2000 (affiliated with MPC),
 * Marstrand 1998 (affiliated with MPC).

Furthermore, there were a few informal workshops

 * Utrecht 2005 (informal workshop),
 * Dagstuhl 2002 (IFIP WG2.1 Working Conference),
 * Nottingham 2001 (informal workshop),

There were also (closely related) DGP workshops in Oxford (June
3-4 2004), and a Spring School on DGP in Nottingham (April 24-27
2006, which had a half-day workshop attached).
Additional information:

The WGP steering committee consists of J. Gibbons, R. Hinze, P. Jansson,
J. Järvi, J. Jeuring, B. Oliveira, S. Schupp, and M. Zalewski

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] CFP: APLAS 2011, Kenting, Taiwan

2011-04-11 Thread Shin-Cheng Mu
) 
Radha Jagadeesan (Depaul U., USA) 
Naoki Kobayashi (Tohoku U., Japan) 
Daniel Kroning (U. Oxford, UK) 
Rupak Majumdar (MPI-SWS, Germany) 
Andrzej Murawski (U. Leicester, UK) 
Paulo Oliva (Queen Mary U. London, UK) 
Doron Peled (Bar Ilan U., Israel) 
Sukyoung Ryu (KAIST, Korea) 
Sriram Sankaranarayanan (U. Colorado, USA) 
Armando Solar-Lezama (MIT, USA) 
Sam Staton (U. Cambridge, UK)
Viktor Vafeiadis (MPI-SWS, Germany)
Kapil Vaswani (MSR, India) 
Martin Vechev (IBM, USA)
Peng Wu (IBM, USA) 
Hongseok Yang (Queen Mary U. London, UK) 
Pen-Chung Yew (Academia Sinica, Taiwan) 

ORGANIZERS:
Mike Dodds, U. Cambridge (Poster chair)
Shin-Cheng Mu, Academia Sinica, Taiwan (Local arrangement chair)
Noam Rinetzky, Queen Mary U. London (Publicity chair)
Yih-Kuen Tsay, National Taiwan University (Finance chair)


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] WGP 2011: Call for Papers

2011-02-18 Thread Shin-Cheng Mu
==
  CALL FOR PAPERS

 WGP 2011

7th ACM SIGPLAN Workshop on Generic Programming
   Tokyo, Japan
Sunday, September 18th, 2011

  http://flolac.iis.sinica.edu.tw/wgp11/

Collocated with the International Conference on Functional Programming
 (ICFP 2011)
==


Goals of the workshop
-

Generic programming is about making programs more adaptable by making
them more general. Generic programs often embody non-traditional kinds
of polymorphism; ordinary programs are obtained from them by suitably
instantiating their parameters. In contrast with normal programs, the
parameters of a generic program are often quite rich in structure; for
example they may be other programs, types or type constructors, class
hierarchies, or even programming paradigms.

Generic programming techniques have always been of interest, both to
practitioners and to theoreticians, and, for at least 20 years,
generic programming techniques have been a specific focus of research
in the functional and object-oriented programming communities. Generic
programming has gradually spread to more and more mainstream
languages, and today is widely used in industry. This workshop brings
together leading researchers and practitioners in generic programming
from around the world, and features papers capturing the state of the
art in this important area.

We welcome contributions on all aspects, theoretical as well as
practical, of

* generic programming,
* programming with (C++) concepts,
* meta-programming,
* programming with type classes,
* programming with modules,
* programming with dependent types,
* polytypic programming,
* adaptive object-oriented programming,
* component-based programming,
* strategic programming,
* aspect-oriented programming,
* family polymorphism,
* object-oriented generic programming,
* and so on.

Organizers
--

Co-Chair
  Jaakko Järvi, Texas AM University, USA
Co-Chair
  Shin-Cheng Mu, Academia Sinica, Taiwan

Programme Committee
---

Dave Abrahams, BoostPro Computing, USA
Magne Haveraaen, Universitetet i Bergen, Norway
Akimasa Morihata, Tohoku University, Japan
Pablo Nogueira, Universidad Politécnica de Madrid, Spain
Ulf Norell, Chalmers University of Technology and University of Gothenberg, 
Sweden
Ross Paterson, City University London, UK
Rinus Plasmeijer, Radboud University Nijmegen, The Netherlands
Sibylle Schupp, Technische Universität Hamburg-Harburg, Germany
Andrew Sutton, Kent State University, USA
Tarmo Uustalu, Institute of Cybernetics, Estonia

Important Information
-

We plan to have formal proceedings, published by the ACM.

Submission details
Deadline for submission: Monday2011-06-06
Notification of acceptance:  Tuesday   2011-07-01
Final submission due:Monday2011-07-25
Workshop:Sunday2011-09-18

Authors should submit papers, in postscript or PDF format, formatted
for A4 paper, to the WGP11 EasyChair instance by the above deadline.
The length should be restricted to 12 pages in standard
(two-column, 9pt) ACM format. Accepted papers are published by the ACM
and will additionally appear in the ACM digital library.

History of the Workshop on Generic Programming
--

This year:

  * Tokyo, Japan 2011 (affiliated with ICFP11)

Earlier Workshops on Generic Programming have been held in

  * Baltimore, Maryland, US 2010 (affiliated with ICFP10)
  * Edinburgh, UK 2009 (affiliated with ICFP09)
  * Victoria, BC, Canada 2008 (affiliated with ICFP),
  * Portland 2006 (affiliated with ICFP),
  * Ponte de Lima 2000 (affiliated with MPC),
  * Marstrand 1998 (affiliated with MPC).

Furthermore, there were a few informal workshops

  * Utrecht 2005 (informal workshop),
  * Dagstuhl 2002 (IFIP WG2.1 Working Conference),
  * Nottingham 2001 (informal workshop),

There were also (closely related) DGP workshops in Oxford (June
3-4 2004), and a Spring School on DGP in Nottingham (April 24-27
2006, which had a half-day workshop attached).
Additional information:

The WGP steering committee consists of J. Gibbons, R. Hinze, P. Jansson,
J. Järvi, J. Jeuring, B. Oliveira, S. Schupp, and M. Zalewski
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Bidirectional programming in Haskell

2009-05-26 Thread Shin-Cheng Mu

Hi,

On Tue May 26 01:21:28 EDT 2009, Artyom Shalkhakov wrote:
 Are there any libraries for bidirectional [1] programming
 in Haskell? Any related work?

Some of the early work in the PSD project (closely related
to the lenses) were developed in Haskell,

  http://www.iis.sinica.edu.tw/~scm/2007/inv/

It is not in active maintenance now, but if you are interested
in doing something with it, I'd be happy to help. :)

Ref:

[1] S-C. Mu, Z. Hu and M. Takeichi,
An injective language for reversible computation.
In Mathematics of Program Construction 2004, LNCS 3125,
pp. 289-313, July 2004.

[2] Z. Hu, S-C. Mu and M. Takeichi,
A programmable editor for developing structured documents
based on bidirectional transformations.
In Partial Evaluation and Semantics-Based Program Manipulation,
pp. 178-189. August 2004


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] APLAS07 - Call for Participation

2007-09-02 Thread Shin-Cheng Mu
 Method for Prolog Programs
  Based on Call Patterns Semantics
  Lingzhong Zhao (Guilin University of Electronic Technology),
  Tianlong Gu (Guilin University of Electronic Technology),
  Junyan Qian (Guilin University of Electronic Technology)
  and Guoyong Cai (Guilin University of Electronic Technology)

  On a Tighter Integration of Functional and Logic Programming
  Frank Huch (CAU Kiel) and Bernd Brassel (CAU Kiel)

1800-2000 Reception

30 Nov (Fri)

 930-1030 Invited Talk

  Scalable Simulation of Biological Signaling Networks
  Vincent Danos (University of Paris VII  CNRS)

1100-1230 Session 4

  Timed, Distributed, Probabilistic, Typed Processes
  Martin Berger (Imperial College London)
  and Nobuko Yoshida (Imperial College London)

  A Probabilistic Applied Pi-Calculus
  Jean Goubault-Larrecq (ENS Cachan),
  Catuscia Palamidessi (Ecole Polytechnique)
  and Angelo Troina (ENS Cachan  Ecole Polytechnique)

  Type-Based Verification of Correspondence Assertions for
  Communication Protocols
  Daisuke Kikuchi (Tohoku University)
  and Naoki Kobayashi (Tohoku University)

1400-1530 Session 5

  Deriving Compilers and Virtual Machines for a Multi-Level  
Language

  Atsushi Igarashi (Kyoto University)
  and Masashi Iwaki (Kyoto University)

  Finally Tagless, Partially Evaluated (Tagless Staged
  Interpreters for Simpler Typed Languages)
  Jacques Carette (McMaster University),
  Oleg Kiselyov (FNMOC) and Chung-chieh Shan (Rutgers  
University)


  Polymorphic Delimited Continuations
  Kenichi Asai (Ochanomizu University)
  and Yukiyoshi Kameyama (University of Tsukuba)

1600-1730 Session 6

  Adjunct Elimination in Context Logic for Trees
  Cristiano Calcagno (Imperial College London),
  Thomas Dinsdale-Young (Imperial College London)
  and Philippa Gardner (Imperial College London)

  Positive Arithmetic without Exchange is a Subclassical Logic
  Stefano Berardi (University of Torino)
  and Makoto Tatsuta (National Institute of Informatics, Japan)

  Mixed Inductive/Coinductive Types and Strong Normalization
  Andreas Abel (University of Munich)

1900-2100 Banquet

 1 Dec (Sat)

 930-1030 Invited Talk

  Static and Dynamic Analysis : Better Together
  Sriram Rajamani (Microsoft Research India)

1100-1230 Session 7

  The Semantics of  Semantic Patches in Coccinelle:  Program
  Transformation for the Working Programmer
  Neil Jones (University of Copenhagen)
  and Rene Rydhof Hansen (University of Copenhagen)

  An Efficient SSA-based Algorithm for Complete Global Value  
Numbering
  Jiu-Tao Nie (Peking University) and Xu Cheng (Peking  
University)


  A Systematic Approach to Probabilistic Pointer Analysis
  Alessandra Di Pierro (University of Verona),
  Chris Hankin (Imperial College London)
  and Herbert Wiklicky (Imperial College London)

1400-1530 Session 8

  Complete Lattices and Up-to Techniques
  Damien Pous (ENS Lyon)

  A Trace Based Bisimulation for the Spi Calculus: An  
Extended Abstract

  Alwen Tiu (Australian National University)

  CCS with Replication in the Chomsky Hierarchy: The Expressive
  Power of Divergence
  Frank Valencia (Ecole Polytechnique),
  Cinzia Di Giusto (University of Bologna),
  Jesus Aranda (Ecole Polytechnique)
  and Mogens Nielsen (University of Aarhus)

1600-1700 Session 9

  Call-by-Name and Call-by-Value in Normal Modal Logic
  Yoshihiko Kakutani (University of Tokyo)

  Call-by-Value is Dual to Call-by-Name, Extended
  Daisuke Kimura (National Institute of Informatics, Japan)

1700-1715 Closing note


Organization
-

GENERAL CHAIR
  Joxan Jaffar (National University of Singapore, Singapore)

PROGRAM CHAIR
  Zhong Shao   (Yale University, USA)

PROGRAM COMMITTEE
  Lars Birkedal(IT University of Copenhagen, Denmark)
  Martin Hofmann   (Univ of Munich, Germany)
  Kohei Honda  (Queen Mary, University of London, UK)
  Atsushi Igarashi (Kyoto University, Japan)
  Suresh Jagannathan   (Purdue University, USA)
  Annie Liu(State University of New York at Stony Brook,  
USA)

  Shin-Cheng Mu(Academia Sinica, Taiwan)
  Henrik Nilsson   (University of Nottingham, UK)
  Michael Norrish  (NICTA, Australia)
  Jens Palsberg(University of California, Los Angeles, USA)
  G. Ramalingam(Microsoft Research, India)
  Zhendong Su  (University of California, Davis, USA)
  Martin Sulzmann  (National University of Singapore, Singapore)
  Eijiro

[Haskell-cafe] Re: creating graphics the functional way

2007-08-06 Thread Shin-Cheng Mu

On 05/08/07, Frank Buss [EMAIL PROTECTED] wrote:
Is it possible to write functions with an arbitrary number of  
arguments?

Would be nice if the average function would accept any number of pixel
values.


You may be interested to see Oleg Kiselyov's discussion on
polyvariadic functions in Haskell:

  http://okmij.org/ftp/Haskell/types.html#polyvar-fn

sincerely,
Shin


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Developing Programs and Proofs Spontaneously using GADT

2007-08-05 Thread Shin-Cheng Mu

Thanks for the comments. I will take a look at the type
family extension.

The definition of plusFn proposed by Jim Apple worked
nicely. The solution from apfelmus works after fixing
a small typo:


newtype Equal a b = Proof (forall f . f a - f b )

newtype Succ f a  = InSucc { outSucc :: f (S a) }

equalZ :: Equal Z Z
equalZ = Proof id

equalS :: Equal m n - Equal (S m) (S n)-- was (S n) (S m)
equalS (Proof eq) = Proof (outSucc . eq . InSucc)

plusFn :: Plus m n i - Plus m n k - Equal i k
plusFn PlusZ PlusZ = Proof id
plusFn (PlusS x) (PlusS y) = equalS (plusFn x y)


Also thanks to Derek Elkins and Dan Licata for interesting
discussions about dependent sum/product. :)

sincerely,
Shin

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Developing Programs and Proofs Spontaneously using GADT

2007-08-04 Thread Shin-Cheng Mu

I am curious about the possibility of developing Haskell programs
spontaneously with proofs about their properties and have the
type checker verify the proofs for us, in a way one would do in
a dependently typed language.

In the exercise below, I tried to redo part of the merge-sort
example in Altenkirch, McBride, and McKinna's introduction to
Epigram [1]: deal the input list into a binary tree, and fold
the tree by the function merging two sorted lists into one.
The property I am going to show is merely that the length of
the input list is preserved.

Given that dependent types and GADTs are such popular topics,
I believe the same must have been done before, and there may be
better ways to do it. If so, please give me some comments or
references. Any comments are welcomed.

 {-# OPTIONS_GHC -fglasgow-exts #-}

To begin with, we define the usual type-level representation
of natural numbers and lists indexed by their lengths.

 data Z = Z   deriving Show
 data S a = S a   deriving Show

 data List a n where
   Nil :: List a Z
   Cons :: a - List a n - List a (S n)

1. Append

To warm up, let us see the familiar append example.
Unfortunately, unlike Omega, Haskell does not provide type
functions. I am not sure which is the best way to
emulate type functions. One possibility is to introduce
the following GADT:

 data Plus m n k where--- m + n = k
   PlusZ :: Plus Z n n
   PlusS :: Plus m n k - Plus (S m) n (S k)

such that Plus m n k represents a proof that m + n = k.

Not having type functions, one of the possible ways to do
append is to have the function, given two lists of lengths
m and n, return a list of length k and a proof that m + n = k.
Thus, the type of append would be:

  append :: List a m - List a n
  - exists k. (List a k, Plus m n k)

In Haskell, the existential quantifier is mimicked by forall.
We define:

 data DepSum a p = forall i . DepSum (a i) (p i)

The term dependent sum is borrowed from the Omega tutorial
of Sheard, Hook, and Linger [2] (why is it called a sum,
not a product?).

The function append can thus be defined as:

 append :: List a m - List a n - DepSum (List a) (Plus m n)
 append Nil ys = DepSum ys PlusZ
 append (Cons x xs) ys =
case (append xs ys) of
  DepSum zs p - DepSum (Cons x zs) (PlusS p)

Another possibility is to provide append a proof that m + n = k.
The type and definition of of append would be:

 append :: Plus m n k - List a m - List a n - List a k
 append PlusZ Nil ys = ys
 append (PlusS pf) (Cons x xs) ys = Cons x (append pf xs ys)

I thought the second append would be more difficult to
use: to append two lists, I have to provide a proof about
their lengths! It turns out that this append actually composes
easier with other parts of the program. We will come to this
later.

2. Some Lemmas

Here are some lemmas represented as functions on terms. The
function, for example, converts a proof of m + (1+n) = k
to a proof of (1+m) + n = k.

 incAssocL :: Plus m (S n) k - Plus (S m) n k
 incAssocL PlusZ = PlusS PlusZ
 incAssocL (PlusS p) = PlusS (incAssocL p)

 incAssocR :: Plus (S m) n k - Plus m (S n) k
 incAssocR (PlusS p) = plusMono p

 plusMono :: Plus m n k - Plus m (S n) (S k)
 plusMono PlusZ = PlusZ
 plusMono (PlusS p) = PlusS (plusMono p)

For example, the following function revcat performs list
reversal by an accumulating parameter. The invariant we
maintain is m + n = k. To prove that the invariant holds,
we have to use incAssocL.

 revcat :: List a m - List a n - DepSum (List a) (Plus m n)
 revcat Nil ys = DepSum ys PlusZ
 revcat (Cons x xs) ys =
 case revcat xs (Cons x ys) of
   DepSum zs p - DepSum zs (incAssocL p)

3. Merge

Apart from the proof manipulations, the function merge
is not very different from what one would expect:

 merge :: Ord a = List a m - List a n - DepSum (List a) (Plus m n)
 merge Nil ys = DepSum ys PlusZ
 merge (Cons x xs) Nil = append (Cons x xs) Nil
 merge (Cons x xs) (Cons y ys)
   | x = y = case merge xs (Cons y ys) of
 DepSum zs p - DepSum (Cons x zs) (PlusS p)
   | otherwise = case merge (Cons x xs) ys of
   DepSum zs p - DepSum (Cons y zs) (plusMono p)

The lemma plusMono is used to convert a proof of
m + n = k to a proof of m + (1+n) = 1+k.

4. Sized Trees

We also index binary trees by their sizes:

 data Tree a n where
   Nul :: Tree a Z
   Tip :: a - Tree a (S Z)
   Bin :: Tree a n1 - Tree a n -
 (Plus p n n1, Plus n1 n k) - Tree a k

The two trees given to the constructor Bin have sizes n1 and n
respectively. The resulting tree, of size k, comes with a proof
that n1 + n = k. Furthermore, we want to maintain an invariant
that n1 either equals n, or is bigger than n by one. This is
represented by the proof Plus p n n1. In the definition of
insertT later, p is either PlusZ or PlusS PlusZ.

5. Lists to Trees

The function insertT inserts an element into a tree:

 insertT :: a - Tree a n - Tree a (S n)
 insertT x Nul = Tip x
 insertT x (Tip y) = 

[Haskell-cafe] Space leak when returning pairs?

2006-05-19 Thread Shin-Cheng Mu

Dear members,

I am experiencing a space leak, which I suspect to be
an instance of the problem addressed by Wadler before.
I'd appreciate if someone here would take a look.

Given the following datatype:

 data XMLEvent = StartEvent String
   | EndEvent   String
   | TextEvent  String  deriving Show

where the three constructors represent the start tag
(a where a is a string), end tag (/a), and text data,
respectively, of an XML stream. (They are actually from
the library HXML). The following function
simply returns the same stream while doing a minimal
amount of validation (ignoring the closing tag).

  idX :: [XMLEvent] - ([XMLEvent], [XMLEvent])
  idX [] = ([], [])
  idX (StartEvent a : strm) =
let (ts, strm') = idX strm
(us, strm'') = idX strm'
in (StartEvent a [] : ts ++ EndEvent a : us, strm'')
  idX (EndEvent _: strm) = ([], strm)
  idX (TextEvent s : strm) =
let (ts, strm') = idX strm
in (TextEvent s : ts, strm')

The function idX returns a pair, where the first component
is the processed stream, while the second component is the
rest of the input. The intention is to thread the input
and release processed events.

If the function is used in a pipelined manner:

   print . fst . idX . parseInstance $ input

where parseInstance :: String - [XMLEvent] is a
lexical analyser, I would expect that the input stream will
be consumed one by one as the output is produced, and the
program would use a constant amount of heap space (while
keeping a program stack proportional to the depth of the
XML tree). This was not the case, however. For some reason
the heap grew linearly. The GHC profiler showed that most
of the thunks are created by parseInstance, which implies
that the entire input stream somehow still resides in memory.

I was wondering where the space leak came from and suspected
that it's the leak described in one of Philip Wadler's
early paper Fixing Some Space Leaks With a Garbage Collector (1987).
Consider

idX (StartEvent a : strm) =
  let (ts, strm') = idX strm
  (us, strm'') = idX strm'
  in (StartEvent a [] : ts ++ EndEvent a : us, strm'')

The body of the let clause might have actually been treated as

 (StartEvent a [] : ts ++
EndEvent a : fst (idX (snd strm)),
  snd (idX (snd strm)))

Therefore strm will not be freed until the whole evaluation
finishes.

But since Wadler has addressed this problem a long time ago,
I think the fix he proposed should have been implemented in
GHC already. Was that really the source of the space leak?
If so is there a way to fix it? Or is there another reason
for the leak?

   *   *   *

The function idX above actually resulted from fusing two functions:
buildF parses a stream into a forest, while idF flattens the
tree.

   data ETree = Element String [ETree]
  | Text String

   buildF :: [XMLEvent] - ([ETree], [XMLEvent])
   buildF [] = ([],[])
   buildF (StartEvent a : strm) =
 let (ts, strm') = buildF strm
 (us, strm'') = buildF strm'
 in (Element a ts : us, strm'')
   buildF (EndEvent _ : strm) = ([], strm)
   buildF (TextEvent s : strm) =
 let (ts, strm') = buildF strm
 in (Text s : ts, strm')

   idF :: [ETree] - [XMLEvent]
   idF [] = []
   idF (Element a ts : us) =
 StartEvent a : idF ts ++ EndEvent a : idF us
   idF (Text s : ts) = TextEvent s : idF ts

My original program was like

print . idF . fst . buildF . parseInstance $ input

and it had the same space leak.

By accident (assuming that the input is always a single tree),
I discovered that if I added a wrap . head into the pipe:

print . idF . wrap . head . fst . buildF . parseInstance $ input

where wrap a = [a], the heap residency would reduce by half!
The output remains the same.

My explanation is that applying a head means that
(in the definition of buildF):

buildF (StartEvent a : strm) =
 let (ts, strm') = buildF strm
 (us, strm'') = buildF strm'
 in (Element a ts : us, strm'')

the us part , which contains a reference to strm, need not
be kept. Thus the reduced heap residency.

This seems to support that the space leak resulted from
the same problem Wadler addressed before. But isn't
that solved already in GHC?

   *   *   *

I'd appreciate if someone could look into it. The actual program
is available at

   http://www.psdlab.org/~scm/hx.tar.gz

It actually uses HXML, a library by Joe (Joe English?) to
do the parsing. The main program is hxpc.hs. There is a
1 MB sized sample input file size1m.xml. There are two
simple scripts recompile and runhxpc for compiling
and running the program.

Thank you!

sincerely,
Shin-Cheng Mu


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space leak when returning pairs?

2006-05-19 Thread Shin-Cheng Mu

Dear Henning,

On May 19, 2006, at 6:16 PM, Henning Thielemann wrote:

On Fri, 19 May 2006, Shin-Cheng Mu wrote:

   idX :: [XMLEvent] - ([XMLEvent], [XMLEvent])
   idX (StartEvent a : strm) =
 let (ts, strm') = idX strm
 (us, strm'') = idX strm'
 in (StartEvent a [] : ts ++ EndEvent a : us, strm'')

let ~(ts, strm') = idX strm
~(us, strm'') = idX strm'


Oh, I just tried using lazy patterns for the case for StartEvent
and TextEvent. But the space behaviour remained the same. :(

sincerely,
Shin-Cheng Mu
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Space leak when returning pairs?

2006-05-19 Thread Shin-Cheng Mu


On May 19, 2006, at 6:16 PM, Henning Thielemann wrote:

let ~(ts, strm') = idX strm
~(us, strm'') = idX strm'


I seem to have found a partial solution to the problem.
It's rather ugly, however, and I think there should be
a better way.

The original definition for one of the clauses was like
this:

  idX (StartEvent a : strm) =
  let (ts, strm') = idX strm
  (us, strm'') = idX strm'
  in (StartEvent a [] : ts ++ EndEvent a : us, strm'')

The intention was to thread strm through the recursive calls
and free the items in strm as soon as they are processed. However,
with this definition I needed about 8Mb of heap for a 1Mb input
file. Since I suspected that delayed evaluation of us and strm''
keeps the pointer to strm around for too long, the natural solution
seems to be forcing their evaluation. So I tried:

  idX (StartEvent a : strm) =
  case idX strm of
 (ts, strm') -
 case idX strm' of
   (us, strm'') -
 (StartEvent a [] : ts ++ EndEvent a : us, strm'')

which made things worse. The heap size increased a bit more
and I get a typical bell shaped heap profile suggesting idX
cannot output anything before processing the whole input.

So I have to allow idX to output something while consuming
the input. What eventually worked was this messy mixture
of let and case:

  idX (StartEvent a : strm) =
let (xs, rest) =
 case idX strm of
   (ts, strm') -
  let (ws, rest) =
case idX strm' of
  (us, strm'') - (us, strm'')
  in (ts ++ EndEvent a : ws, rest)
in (StartEvent a [] : xs, rest)

The heap residency reduced from about 8MB to around 80Kb.

This is rather tricky, however. The following nicer looking
version has a heap profile as bad as the worse case.

   idX (StartEvent a : strm) =
 let (ts,us,strm'') =
 case idXsp strm of
   (ts, strm') -
   case idXsp strm' of
 (us, strm'') - (ts, us, strm'')
 in (StartEvent a [] : ts ++ EndEvent a : us, strm'')

And I cannot quite explain why (Anyone?).

Is there a more structured solution? Can I leave the hard
work to the compiler?

sincerely,
Shin-Cheng Mu
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe