Re: [Haskell-cafe] Two GHC-related GSoC Proposals

2013-06-02 Thread Boris Lykah
It is not obvious that the semantics are preserved for optimisations which
remove non-constant subexpressions, for example:

bar a b = a + b - a - b -- the RHS should be optimised away to 0

Calling bar undefined undefined throws an error, but the optimised bar
would return 0.
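To spell the point out, here is a quick sketch (bar' is just the optimised
right-hand side written by hand):

bar, bar' :: Int -> Int -> Int
bar  a b = a + b - a - b   -- forces its arguments when evaluated
bar' _ _ = 0               -- what the proposed optimisation would produce

-- ghci> bar undefined undefined
-- *** Exception: Prelude.undefined
-- ghci> bar' undefined undefined
-- 0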

On Sat, Jun 1, 2013 at 8:10 PM, Patrick Palka patr...@parcs.ath.cx wrote:

 Yeah, I am going to use the MVar approach. Alternative implementations
 will be investigated if this approach happens to not scale well.
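
A minimal sketch of the MVar-per-module idea (hypothetical types and a
stand-in compile action, not GHC's actual driver code; the error handling
mentioned below is deliberately left out):

import Control.Applicative ((<$>))
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Monad (forM_)
import qualified Data.Map as M

type ModName = String

-- One MVar per module; each worker thread blocks on the MVars of its
-- dependencies, compiles its module, then fills its own MVar.  A failed
-- dependency would leave its MVar empty forever, which is exactly the
-- tricky error-handling part discussed in this thread.
compileGraph :: [(ModName, [ModName])] -> (ModName -> IO ()) -> IO ()
compileGraph graph compile = do
  dones <- M.fromList <$> mapM (\(m, _) -> (,) m <$> newEmptyMVar) graph
  forM_ graph $ \(m, deps) -> forkIO $ do
    mapM_ (\d -> readMVar (dones M.! d)) deps   -- wait for dependencies
    compile m
    putMVar (dones M.! m) ()                    -- signal dependents
  mapM_ (readMVar . (dones M.!) . fst) graph    -- wait for all modules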


 On Fri, May 31, 2013 at 9:10 AM, Thomas Schilling nomin...@googlemail.com
  wrote:

 [I'll be the mentor for this GSoC project.]

 I used the MVar approach a while ago, as did Simon Marlow's
 original solution.  Using MVars and threads for this should scale well
 enough (1000s of modules) and be relatively straightforward.
 Error/exception handling could be a bit tricky, but you could use (or
 copy ideas from) the 'async' package to deal with that.

  / Thomas

 On 30 May 2013 18:51, Ryan Newton rrnew...@gmail.com wrote:
  What's the plan for what control / synchronization structures you'll
  use in part 2 of the plan to implement a parallel driver?
 
  Is the idea just to use an IO thread for each compile and block them on
  MVars when they encounter dependencies?  Or you can use a pool of worker
  threads and a work queue, and only add modules to the work queue when
  all their dependencies are met (limits memory use)... many options for
  executing a task DAG.  Fortunately the granularity is coarse.
 
   -Ryan
 
 
 
  On Sun, Apr 21, 2013 at 10:34 PM, Patrick Palka patr...@parcs.ath.cx
  wrote:
 
  Good points. I did not take into account whether proposal #2 may be
  worth it in light of -fllvm. I suppose that even if the LLVM codegen is
  able to perform similar optimizations, it would still be beneficial to
  implement proposal #2 as a core-to-core pass, because the transformations
  it performs would expose new information to subsequent core-to-core
  passes. Also, Haskell has different overflow rules than C does (whose
  rules I assume LLVM's optimizations are modeled on): in Haskell, integer
  overflow is undefined for all integral types, whereas in C it's only
  undefined for signed integral types. This limits the number of
  optimizations a C-based optimizer can perform on unsigned arithmetic.
 
  I'm not sure how I would break up the parallel compilation proposal into
  multiple self-contained units of work. I can only think of two units:
  making GHC thread-safe, and writing the new parallel compilation driver.
  Other incidental units may come up during development (e.g. parallel
  compilation might exacerbate #4012), but I still feel that three months
  of full-time work is ample time to complete the project, especially with
  existing familiarity with the code base.
 
  Thanks for the feedback.
 
 
  On Sun, Apr 21, 2013 at 5:55 PM, Carter Schonwald
  carter.schonw...@gmail.com wrote:
 
  Hey Patrick,
  both are excellent ideas for work that would be really valuable for the
  community!
  (independent of whether or not they can be made into GSOC-sized chunks)
 
  ---
  I'm actually hoping to invest some time this summer investigating
  improving the numerics optimization story in GHC. This is in large part
  because I'm writing LOTS of numeric code in Haskell presently (hopefully
  on track to make some available to the community).
 
  That said, it's not entirely obvious (at least to me) what a tractable,
  focused GSOC-sized subset of the numerics optimization improvement would
  be, and that would also have to be a subset that has real performance
  impact and doesn't benefit from e.g. using -fllvm rather than -fasm.
  -
 
  For helping pave the way to better parallel builds, what are some
  self-contained units of work on GHC (or related work on cabal) that
  might be tractable over a summer? If you can put together a clear
  roadmap of work chunks that are tractable over the course of the summer,
  I'd favor choosing that work, especially if you can give a clear outline
  of the plan per chunk and how to evaluate the success of each unit of
  work.
 
  Basically: while both are high-value projects, helping improve the
  parallel build tooling (whether in performance or correctness or both!)
  has a more obvious scope of community impact, and if you can lay out a
  clear plan of work that GHC folks agree with and that seems doable, I'd
  favor that project :)
 
  Hope this feedback helps you sort out project ideas.
 
  cheers
  -Carter
 
 
 
 
  On Sun, Apr 21, 2013 at 12:20 PM, Patrick Palka patr...@parcs.ath.cx
  wrote:
 
  Hi,
 
  I'm interested in participating in the GSoC by improving GHC with one
  of these two features:
 
  1) Implement native support for compiling modules in parallel (see
  #910). This will involve making the compilation pipeline thread-safe,
  implementing the logic for building modules in parallel (with an
  emphasis on keeping compiler output deterministic), and 

[Haskell-cafe] ANN: Groundhog 0.3 - mysql, schemas and enhanced queries

2013-04-19 Thread Boris Lykah
I am happy to announce the release of Groundhog 0.3!

Groundhog is a library for high-level database access. It has support for
Sqlite, PostgreSQL, and a new MySQL backend. Advanced migration
capabilities allow you to precisely specify the schema description, fitting
it to an existing database, or creating a migration script for a new one.
Groundhog is not opinionated about schema and can bind your datatypes to a
relational model which may have composite keys, references across different
schemas, indexes, etc.

Changes:
* Novel mechanism for defining your own functions, operators, or other
expressions and combining them in a type-safe manner. The operators LIKE
and IN, which appeared in this version, are one-liners based on it. Now you
can write advanced queries compositionally without resorting to raw SQL.
For example:
project (upper (ZipCodeField `append` AddressField), NumberField *
(NumberField + (1 :: Int))) $ lower NameField `like` "jack%"
will be transformed into the SQL query
SELECT upper(zipcode || address), number * (number + ?) FROM tablename
WHERE lower(name) like ?
* PostgreSQL backend got support for arrays and geometric datatypes.
* First release of MySQL backend (influenced by persistent-mysql).
* New class ConnectionManager helps extract connections from pools,
application state, or other sources. Now both plain connections and pools
can be run with the same function runDbConn (see the sketch after this
list). In particular, this class helps with integrating Groundhog into web
applications.
* Support for savepoints (nested transactions).
* Schema-qualified tables. Entities can be configured with a schema
parameter. It defines the schema where the table, its triggers, indexes, and
other related database objects are stored. Schemas are respected both
during migration and when generating queries.
* Custom primary keys. A unique set of fields can be defined as a
constraint, index, or primary key. Composite primary keys are supported as
well.
* Configuration of reference clauses ON DELETE and ON UPDATE.
* New class SchemaAnalyzer exposes Groundhog functions for database
introspection. In the future it can be used to build more powerful migration
tools or to generate datatypes from table definitions.
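
As a minimal sketch of the runDbConn point above (the Sqlite helpers and
their argument order are assumptions here; the monadIntegration.hs example
linked below covers the web-application case):

import Database.Groundhog.Sqlite

main :: IO ()
main = do
  -- a plain connection and a connection pool can both be passed to
  -- runDbConn, since both have ConnectionManager instances
  withSqliteConn ":memory:" $ runDbConn $
    return ()                       -- migrations and queries would go here
  withSqlitePool ":memory:" 5 $ runDbConn $
    return ()                       -- the same code, now backed by a pool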

The full description of the configuration options is available at
http://hackage.haskell.org/packages/archive/groundhog-th/0.3.0/doc/html/Database-Groundhog-TH.html
Please see examples at
http://github.com/lykahb/groundhog/tree/master/examples.

If you want to integrate Groundhog into an application that is based on
Snap or Yesod, or that just uses another monad to store connections, see this
example:
http://github.com/lykahb/groundhog/blob/master/examples/monadIntegration.hs.
Although there are many new features in this release, the Groundhog API is
mature enough to minimize breaking changes. To migrate from Groundhog
0.2.* replace runSqliteConn/runPostgresqlConn/runSqlitePool/runPostgresqlPool
with runDbConn.

Special thanks to MightyByte, who suggested generalizing connections and
helped find the design, and to many others who gave their inspiring
feedback.

* http://github.com/lykahb/groundhog
* http://hackage.haskell.org/package/groundhog
* http://hackage.haskell.org/package/groundhog-th
* http://hackage.haskell.org/package/groundhog-mysql
* http://hackage.haskell.org/package/groundhog-postgresql
* http://hackage.haskell.org/package/groundhog-sqlite

Enjoy!

-- 
Regards,
Boris
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANN: Groundhog 0.3 - mysql, schemas and enhanced queries

2013-04-19 Thread Boris Lykah
Thank you, Tom.

I've uploaded a new version of groundhog-th which builds on GHC 7.6.2. The
new documentation will be generated soon.

Please use this link
http://hackage.haskell.org/packages/archive/groundhog-th/latest/doc/html/Database-Groundhog-TH.html



On Fri, Apr 19, 2013 at 12:43 PM, Tom Ellis 
tom-lists-haskell-cafe-2...@jaguarpaw.co.uk wrote:

 On Fri, Apr 19, 2013 at 11:42:14AM +0300, Boris Lykah wrote:
  The full description of the configuration options is available at
 
 http://hackage.haskell.org/packages/archive/groundhog-th/0.3.0/doc/html/Database-Groundhog-TH.html

 Hi Boris, the docs for 0.3.0 don't currently seem to exist.

 Tom

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe




-- 
Regards,
Boris
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] ANN: Groundhog 0.2 - now with database index support

2012-10-10 Thread Boris Lykah
I am excited to announce Groundhog 0.2, a new release of the library
for high-level database access.

* http://hackage.haskell.org/package/groundhog
* http://hackage.haskell.org/package/groundhog-th
* http://hackage.haskell.org/package/groundhog-postgresql
* http://hackage.haskell.org/package/groundhog-sqlite
* http://github.com/lykahb/groundhog

Since the last release I have received a lot of positive feedback which
showed that many people are interested in the project. I want to thank them
for their encouragement and constructive suggestions, which helped me
adjust the development priorities. To help keep track of the Groundhog
project's development, the most important changes of each release are now
logged to the HISTORY files in the repository.

Along with many small improvements and refactorings, this major
release has two novel features:

* Support for database unique indexes. Unique keys can now be created
with either constraints or indexes. In the API the difference affects
only the code-generation settings, which have a new field.
* Customizable column type names for schema migration. This improves
compatibility with existing databases and allows using database-specific
features like limiting string size or number precision. See the example at
https://github.com/lykahb/groundhog/blob/master/examples/dbSpecificTypes.hs
and the sketch after this list.
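
A rough sketch of what such a setting can look like in the YAML-based
configuration (the entity, its fields, and the exact keys are assumptions
for illustration only; the dbSpecificTypes.hs example above is the
authoritative reference):

# hypothetical entity; the type key overrides the column type that
# migration would otherwise derive from the Haskell field type
- entity: Person
  constructors:
    - name: Person
      fields:
        - name: personName
          type: varchar(50)     # assumed syntax for a size-limited string
        - name: height
          type: numeric(5,2)    # assumed syntax for fixed precision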

Package groundhog-th has improved error reporting. Now it precisely
reports the line and position where parsing of the YAML-based settings
failed. This required a patch in the yaml dependency, so other yaml
package users have this fix as well. The full description of the YAML
settings format is available at
http://hackage.haskell.org/packages/archive/groundhog-th/0.2.0/doc/html/Database-Groundhog-TH.html

I appreciate your feedback.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: Groundhog 0.1.0.1 - high-level database library

2012-09-14 Thread Boris Lykah
Groundhog's design is very similar to Persistent's, so I would choose it
for the comparison. I don't have much experience using Persistent,
so some facts may be inaccurate for the newer versions; they are based
mostly on the Persistent documentation. You may also be interested in
the comparison of Persistent, HaskellDB, and Esqueleto in Felipe Lessa's
announcement of the Esqueleto library:
http://blog.felipe.lessa.nom.br/?p=68

* Persistent creates both the data definitions and the auxiliary structures.
This approach is not modular because it ties the data to a specific
library. Groundhog examines existing data definitions written in the
ordinary way and uses them to create the auxiliary structures.
* Persistent entities always have an autoincrement integer key (a
bytestring for MongoDB). Groundhog entities may omit it and use a natural
key (composites are supported). An entity may have several keys.
* Persistent stores embedded datatypes as a JSON string. Groundhog
flattens the inner fields to several columns, which gives full
access to the inner fields of an embedded datatype.
* Groundhog supports type-safe projections. Persistent can do them only
with raw SQL or esqueleto.
* Groundhog has an expressive query DSL which supports field-to-value
and field-to-field comparisons as well as arbitrary arithmetic
expressions (see the sketch after this list). Persistent supports only
field-to-value comparisons with a single operator.
* Groundhog does not support joins. Persistent supports only
one-to-many joins for two tables, but esqueleto supports many types of
joins for multiple tables.
* Groundhog currently does not support Conduit or other
resource-management libraries, the IN clause, or migrations that add NOT
NULL columns to non-empty tables.
* Groundhog supports Sqlite and PostgreSQL. Persistent supports
Sqlite, PostgreSQL, MySQL, and MongoDB.
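
To make the query DSL point above concrete, here is a small sketch with
hypothetical field names (SizeField, LimitField); the operators and the
select/update argument shapes mirror the Groundhog example in the 0.0.1.1
announcement further down, so details may differ between versions:

-- inside some DbPersist block, over a hypothetical entity:
example = do
  -- field-to-field comparison in a condition
  tooBig <- select (SizeField >. LimitField) [] 0 0
  -- arithmetic over a field, as in the bonus-melon example below
  update [SizeField =. toArith SizeField + 1] (SizeField >. LimitField)
  return tooBig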

On Fri, Sep 14, 2012 at 1:28 AM, Tom Murphy amin...@gmail.com wrote:
 How does this compare with other high-level Haskell db libraries?

 Tom

 On Sep 13, 2012 2:25 PM, Boris Lykah lyk...@gmail.com wrote:

 I am happy to announce a new version of Groundhog, a library for fast
 high-level database access:
 http://hackage.haskell.org/package/groundhog
 http://hackage.haskell.org/package/groundhog-th
 http://hackage.haskell.org/package/groundhog-postgresql
 http://hackage.haskell.org/package/groundhog-sqlite

 Groundhog has been completely overhauled since the last release.
 Notably, it got support for PostgreSQL and natural foreign keys. I
 believe that it is a big step forward as this brings more flexibility
 to the design of the relational schemas while keeping the applications
 independent of the storage layer. Some of the solutions, particularly
 schema migration were based on Persistent code.

 Please see examples at
 http://github.com/lykahb/groundhog/tree/master/examples.

 Features:
 * Support for Sqlite and PostgreSQL.
 * Natural and composite foreign keys. Earlier it was possible to
 reference an entity only by the mandatory integer primary key. Now an
 entity can have several keys including autoincrement primary key
 (optional) and unique keys which have one or more columns.
 * Full support of embedded datatypes. You can access a field that
 contains an embedded datatype as a whole, or access some of the inner
 subfields individually. This powerful mechanism has allowed
 implementation of the composite keys, and can be used in future to
 work with PostgreSQL composite types or MongoDB embedded documents.
 Instead of serializing a value to a string, the Groundhog backends flatten
 the tree of embedded datatypes to DB columns.
 * Projections. You can choose what columns to query from a table in a
 type-safe manner.
 * Initialization and migration of database schema.
 * Sum types and polymorphic types.
 * Expression DSL for use in queries.
 * Basic list support.
 * YAML-based settings mechanism. It separates the datatype definition
 from its description, which facilitates modularity. The settings are
 inferred from an analysis of the definition and overridden with values
 set by the user.

 The Criterion benchmarks are available at
 http://lykahb.github.com/groundhog/SqliteBench.html and
 http://lykahb.github.com/groundhog/PostgreSQLBench.html.

 Future plans:
 * Support for joins
 * Database indexes
 * Investigate options for implementing MongoDB and MySQL backends

 Your feedback, suggestions for improvement and criticism are welcome.

 --
 Regards,
 Boris

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe



-- 
Regards,
Boris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] ANNOUNCE: Groundhog 0.1.0.1 - high-level database library

2012-09-13 Thread Boris Lykah
I am happy to announce a new version of Groundhog, a library for fast
high-level database access:
http://hackage.haskell.org/package/groundhog
http://hackage.haskell.org/package/groundhog-th
http://hackage.haskell.org/package/groundhog-postgresql
http://hackage.haskell.org/package/groundhog-sqlite

Groundhog has been completely overhauled since the last release.
Notably, it got support for PostgreSQL and natural foreign keys. I
believe that it is a big step forward, as this brings more flexibility
to the design of relational schemas while keeping applications
independent of the storage layer. Some of the solutions, particularly
schema migration, were based on Persistent code.

Please see examples at http://github.com/lykahb/groundhog/tree/master/examples.

Features:
* Support for Sqlite and PostgreSQL.
* Natural and composite foreign keys. Earlier it was possible to
reference an entity only by the mandatory integer primary key. Now an
entity can have several keys, including an optional autoincrement primary
key and unique keys spanning one or more columns.
* Full support for embedded datatypes. You can access a field that
contains an embedded datatype as a whole, or access its inner
subfields individually (see the sketch after this list). This powerful
mechanism made the composite keys possible, and can be used in the future
to work with PostgreSQL composite types or MongoDB embedded documents.
Instead of serializing a value to a string, the Groundhog backends flatten
the tree of embedded datatypes to DB columns.
* Projections. You can choose what columns to query from a table in a
type-safe manner.
* Initialization and migration of database schema.
* Sum types and polymorphic types.
* Expression DSL for use in queries.
* Basic list support.
* YAML-based settings mechanism. It separates the datatype definition
from its description, which facilitates modularity. The settings are
inferred from an analysis of the definition and overridden with values
set by the user.
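
A small sketch of the embedded-datatype flattening mentioned in the list
above (hypothetical types; the exact column-naming scheme is left out):

data Address = Address { city :: String, street :: String } deriving Show
data Person  = Person  { name :: String, address :: Address } deriving Show

-- With Address registered as an embedded datatype, a Person row gets a
-- name column plus one column per inner field of the address, instead of
-- a single serialized column, so conditions and projections can reach
-- city or street individually.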

The Criterion benchmarks are available at
http://lykahb.github.com/groundhog/SqliteBench.html and
http://lykahb.github.com/groundhog/PostgreSQLBench.html.

Future plans:
* Support for joins
* Database indexes
* Investigate options for implementing MongoDB and MySQL backends

Your feedback, suggestions for improvement and criticism are welcome.

-- 
Regards,
Boris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] ANNOUNCE: Groundhog-0.0.1.1, a database connectivity library

2011-06-17 Thread Boris Lykah
Hi, everyone! I am happy to announce a new persistence library, Groundhog.

http://hackage.haskell.org/package/groundhog
http://hackage.haskell.org/package/groundhog-sqlite

Groundhog maps between datatypes and a database, like Persistent
in Haskell or Hibernate in Java, and makes dealing with a database very
easy. The API is inspired by Persistent, a great framework by Michael
Snoyman, from which I used some ideas and code. At first I wanted to
enhance Persistent, but many useful ideas don't fit into its design.
Groundhog offers features I wanted to see in a modern Haskell database
library.

My intent was to create a package that you can plug into an existing
project and that just works. It does not require defining datatypes using
its own mechanisms (quasiquotation, XML, etc.); you can use your own types
defined anywhere, and the restrictions on type structure are very mild.
Groundhog uses data families and GADTs, which did not play nicely together
until GHC 7, so GHC 6.12.x and earlier are not supported.

Currently there is support only for Sqlite, but I hope to add more
backends soon. The Sqlite backend is based on the direct-sqlite package by
Dan Knapp. I modified it to improve performance and provide better error
messages. For now it is bundled with groundhog-sqlite, but I hope to merge
it back into direct-sqlite.

On simple datatypes performance is ~2.5 times higher than Persistent's.
Some of this gain comes from Sqlite-specific optimisations, but I expect
to see high performance on other backends as well. In fact, it could be
even faster: I sacrificed ~30% of backend-independent performance for
flexibility when I made DbPersist a monad transformer instead of sticking
with IO, and replaced direct mentions of DbPersist with Monad m in
PersistBackend.

Features:
* Persists datatypes defined in an ordinary way
* Supports fields of user-defined types
* Supports polymorphic datatypes and datatypes with several constructors
* Basic support for lists and tuples
* Type safety
* Migration from an empty schema
* Powerful expression DSL for use in queries
* Execution of arbitrary queries
* High performance

Plans (in priority order):
* Add more backends, particularly PostgreSQL
* Allow migration when data definition changes
* Add Persistent-like quasiquotation syntax

Here is an example from the documentation:

{-# LANGUAGE GADTs, TypeFamilies, TemplateHaskell #-}
import Control.Monad.IO.Class(liftIO)
import Database.Groundhog.TH
import Database.Groundhog.Sqlite

data Customer a = Customer {customerName :: String, details :: a} deriving Show
data Item = ProductItem {productName :: String, quantity :: Int,
                         customer :: Customer String}
          | ServiceItem {serviceName :: String, deliveryAddress :: String,
                         servicePrice :: Int}
  deriving Show

deriveEntity ''Customer $ Just $ do
  setConstructor 'Customer $ do
    setConstraints [("NameConstraint", ["customerName"])]
deriveEntity ''Item Nothing

main = withSqliteConn ":memory:" $ runSqliteConn $ do
  -- Customer is also migrated because Item contains it
  runMigration silentMigrationLogger $ migrate (undefined :: Item)
  let john = Customer "John Doe" "Phone: 01234567"
  johnKey <- insert john
  -- John is inserted only once because of the name constraint
  insert $ ProductItem "Apples" 5 john
  insert $ ProductItem "Melon" 2 john
  insert $ ServiceItem "Taxi" "Elm Street" 50
  insert $ ProductItem "Melon" 6 (Customer "Jack Smith" "Don't let him pay by check")
  -- bonus melon for all large melon orders
  update [QuantityField =. toArith QuantityField + 1]
         (ProductNameField ==. "Melon" &&. QuantityField >. (5 :: Int))
  productsForJohn <- select (CustomerField ==. johnKey) [] 0 0
  liftIO $ putStrLn $ "Products for John: " ++ show productsForJohn
  -- check bonus
  melon <- select (ProductNameField ==. "Melon") [Desc QuantityField] 0 0
  liftIO $ putStrLn $ "Melon orders: " ++ show melon

Currently Hackage cannot build it due to technical issues (as Dons
assumed, several packages it depends on are not exposed), but you can
install it with cabal.

It is still a very early beta and it may have some bugs. I am very
interested to hear your feedback.

Thanks,
Boris Lykah

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Template Haskell support for GADTs

2011-04-30 Thread Boris Lykah
Hi all!

I am writing a library which makes it possible to refer to the individual
fields of a datatype. The fields are described as a GADT with one
constructor per field; each constructor returns the GADT indexed by its
field's type. The auxiliary data structures for this should be generated
automatically via Template Haskell.

I found that the GADTs produced by TH are not equivalent to the usual
ones. They require the additional extension -XTypeFamilies (for the
equality constraints) along with -XGADTs, and, more importantly, are
less type-safe. To describe the data structure I used ForallC, as was
suggested in the closed ticket
http://hackage.haskell.org/trac/ghc/ticket/3497 (Template Haskell
support for GADTs).

This is simplified code for data Sample = Sample {foo::String,
bar::Int} written manually:

data SampleField a where
  FooField :: SampleField String
  BarField :: SampleField Int

and a GADT with similar structure generated via TH.

$(do
    let gadt = mkName "THSampleField"
    let tv = mkName "a"
    let con1 = forallC [plainTV tv] (cxt [equalP (varT tv) (conT ''String)]) $
                 normalC (mkName "THFooField") []
    let con2 = forallC [plainTV tv] (cxt [equalP (varT tv) (conT ''Int)]) $
                 normalC (mkName "THBarField") []
    result <- dataD (cxt []) gadt [plainTV tv] [con1, con2] []
    return [result]
 )
which produces
data THSampleField a where
  THFooField :: a ~ String => THSampleField a
  THBarField :: a ~ Int => THSampleField a

The expression
asTypeOf FooField BarField
fails to compile, as expected, because String cannot match Int, but
asTypeOf THFooField THBarField
is accepted as a valid expression of type (THSampleField a), which is very
confusing and breaks the existing code.

Am I missing something? If this is the only way to create the GADTs
then I think we should consider reopening the ticket.

-- 
Regards,
Boris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe