[Haskell-cafe] representing spreadsheets
Hi everyone,

I'm hoping someone can point me in the right direction for a project I'm working on. Essentially I would like to represent a grid of data (much like a spreadsheet) in pure code. In this sense, one would need functions to operate on the concepts of rows and columns. A simple cell might be represented like this:

    data Cell = CellStr Text | CellInt Integer | CellDbl Double | CellEmpty

The spreadsheet analogy isn't too literal, as I'll be using this for data with a more regular structure. For instance, one grid might have 3 columns where every item in column one is a CellStr, every item in column two a CellStr, and every item in column 3 a CellDbl, but within a given grid there won't be surprise rows with extra columns or columns that contain some cell strings, some cell ints, etc.

Representing cells in a matrix makes the most sense to me, in order to facilitate access by columns or rows or both, and I'd like to know if there's a particular matrix library that would work well with this idea. However, I'm certainly open to any other data structures that may be better suited to the task.

Thanks!
Eric

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
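For illustration only, here is a minimal sketch of the row and column access described in this message, using a plain list of rows rather than any particular matrix library. The `Cell` type is the one from the message; `Grid`, `nth`, `rowAt`, `columnAt`, `cellAt` and `sample` are hypothetical names, and the sketch assumes every row has the same length.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.List (transpose)
import Data.Text (Text)

-- The cell type from the message above.
data Cell = CellStr Text | CellInt Integer | CellDbl Double | CellEmpty
  deriving (Eq, Show)

-- Assumption: a grid is a list of rows, all of the same length.
type Grid = [[Cell]]

-- Safe zero-based list indexing (hypothetical helper).
nth :: Int -> [a] -> Maybe a
nth i xs
  | i < 0 = Nothing
  | otherwise = case drop i xs of
      (x : _) -> Just x
      []      -> Nothing

-- Row i of the grid, if it exists.
rowAt :: Int -> Grid -> Maybe [Cell]
rowAt = nth

-- Column j of the grid: transpose, then take row j.
columnAt :: Int -> Grid -> Maybe [Cell]
columnAt j = nth j . transpose

-- The single cell at row i, column j.
cellAt :: Int -> Int -> Grid -> Maybe Cell
cellAt i j grid = rowAt i grid >>= nth j

-- Made-up sample data: two string columns and one double column.
sample :: Grid
sample =
  [ [CellStr "widget", CellStr "blue", CellDbl 19.5]
  , [CellStr "gadget", CellStr "red",  CellDbl 0.5]
  ]
```

Loaded in GHCi, `columnAt 2 sample` yields the two `CellDbl` values. Lists give linear-time indexing, so this is only reasonable for small grids; a vector or array type would be the natural next step.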
Re: [Haskell-cafe] representing spreadsheets
Hi Eric

A spreadsheet is an indexed / tabular structure, which doesn't map well to Haskell's built-in way of defining data: algebraic types, which are trees via sums and products.

Wolfram Kahl has a paper on modelling tables in Haskell, "Compositional Syntax and Semantics of Tables", which might be interesting / useful: tables look like they have strong similarities to spreadsheets, and the implementation is included in the appendix. Unfortunately the code is very complicated. I say this intending no criticism or judgement of Wolfram's work, just that it takes a lot of type system power to get over the representation mismatch between trees and tables.

Wolfram Kahl - Compositional Syntax and Semantics of Tables
http://www.cas.mcmaster.ca/sqrl/papers/sqrl15.pdf

Best wishes

Stephen
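One simple way to see the "indexed structure" point is to key cells by position instead of nesting them in a tree. The following is only a sketch under that assumption and is unrelated to the implementation in Kahl's paper; `Pos`, `Sheet`, `fromRows`, `cell`, `row` and `column` are hypothetical names.

```haskell
import qualified Data.Map.Strict as Map

-- (row, column), both zero-based.
type Pos = (Int, Int)

-- A sheet is a finite map from positions to values.
newtype Sheet a = Sheet (Map.Map Pos a)

-- Build a sheet from a list of rows, numbering rows and columns from 0.
fromRows :: [[a]] -> Sheet a
fromRows rows = Sheet (Map.fromList
  [ ((r, c), x) | (r, cells) <- zip [0 ..] rows, (c, x) <- zip [0 ..] cells ])

-- Look up one cell by position.
cell :: Pos -> Sheet a -> Maybe a
cell p (Sheet m) = Map.lookup p m

-- All values in row r, in ascending column order.
row :: Int -> Sheet a -> [a]
row r (Sheet m) = [x | ((r', _), x) <- Map.toAscList m, r' == r]

-- All values in column c, in ascending row order.
column :: Int -> Sheet a -> [a]
column c (Sheet m) = [x | ((_, c'), x) <- Map.toAscList m, c' == c]
```

Rows and columns are symmetric here, which a list of lists does not give you, but nothing in the types says that column 3 always holds doubles. That gap is the representation mismatch the message describes.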
Re: [Haskell-cafe] representing spreadsheets
Hi,

Eric Rasmussen wrote:
> The spreadsheet analogy isn't too literal as I'll be using this for data with a more regular structure. For instance, one grid might have 3 columns where every item in column one is a CellStr, every item in column two a CellStr, and every item in column 3 a CellDbl, but within a given grid there won't be surprise rows with extra columns or columns that contain some cell strings, some cell ints, etc.

Sounds more like a database than like a spreadsheet.

  Tillmann
Re: [Haskell-cafe] representing spreadsheets
Stephen, thanks for the link! The paper was an interesting read and definitely gave me some ideas.

Tillmann -- you are correct in that it's very similar to a database. I frequently go through this process:

1) Receive a flat file (various formats) of tabular data
2) Create a model of the data and a parser for the file
3) Code utilities that allow business users to filter/query/accumulate/compare the files

The models are always changing, so one option would be to inspect a user-supplied definition, build a SQLite database to match, and use Haskell to feed in the data and run queries. However, I'm usually dealing with files that can easily be loaded into memory, and generally they aren't accessed with enough frequency to justify persisting them in a separate format.

It's actually worked fine in the past to code a custom data type with record syntax (or sometimes just tuples) and simply build a list of them, but the challenge in taking this to a higher level is reading in a user-supplied definition, perhaps translated as 'the first column should be indexed by the string "Purchase amount" and contains a Double', and then performing calculations on those doubles based on further user input.

I'm trying to get over bad object-oriented habits of assigning attributes at runtime and inspecting types to determine which functions can be applied to which data, and I'm not sure what concepts of functional programming better address these requirements.

On Fri, May 27, 2011 at 12:33 PM, Tillmann Rendel ren...@informatik.uni-marburg.de wrote:
> Sounds more like a database than like a spreadsheet.
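As a rough sketch of the workflow in this message, a user-supplied definition can be treated as ordinary data: a list of column names paired with a column type, checked once when each row is parsed. Everything here is an assumption made for illustration: the names `ColType`, `Schema`, `parseCell`, `parseRow` and `sumColumn` are hypothetical, `String` stands in for `Text` to keep the example within base, and an empty field is taken to mean `CellEmpty`.

```haskell
import Control.Monad (guard)
import Data.List (elemIndex)
import Text.Read (readMaybe)

-- The kinds of column a user-supplied definition may declare.
data ColType = StrCol | IntCol | DblCol
  deriving (Eq, Show)

-- Cell type from the start of the thread, with String in place of Text.
data Cell = CellStr String | CellInt Integer | CellDbl Double | CellEmpty
  deriving (Eq, Show)

-- A schema names each column and fixes its type, in column order.
type Schema = [(String, ColType)]

-- Parse one raw field according to its declared column type.
-- Assumption: an empty field means "no value".
parseCell :: ColType -> String -> Maybe Cell
parseCell _ "" = Just CellEmpty
parseCell StrCol s = Just (CellStr s)
parseCell IntCol s = CellInt <$> readMaybe s
parseCell DblCol s = CellDbl <$> readMaybe s

-- Parse a whole row; fails on a wrong field count or an unparsable field.
parseRow :: Schema -> [String] -> Maybe [Cell]
parseRow schema raws
  | length schema == length raws =
      sequence (zipWith (parseCell . snd) schema raws)
  | otherwise = Nothing

-- Sum a column chosen by name at runtime.
-- Returns Nothing if the column is unknown or not declared as DblCol;
-- empty cells are skipped.
sumColumn :: Schema -> String -> [[Cell]] -> Maybe Double
sumColumn schema colName rows = do
  i <- elemIndex colName (map fst schema)
  guard (lookup colName schema == Just DblCol)
  pure (sum [d | r <- rows, CellDbl d <- take 1 (drop i r)])
```

With the schema `[("Item", StrCol), ("Purchase amount", DblCol)]`, rows parsed by `parseRow` can be passed to `sumColumn schema "Purchase amount"`. The type inspection happens once, against the schema, instead of on every value at every use.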
Re: [Haskell-cafe] representing spreadsheets
On Fri, May 27, 2011 at 3:11 PM, Eric Rasmussen ericrasmus...@gmail.com wrote:
> The models are always changing, so one option would be to inspect a user-supplied definition, build a SQLite database to match, and use Haskell to feed in the data and run queries. However, I'm usually dealing with files that can easily be loaded into memory, and generally they aren't accessed with enough frequency to justify persisting them in a separate format.

Worth it in what terms? You're either going to have to encode the relationships yourself, or else automate the process.

> It's actually worked fine in the past to code a custom data type with record syntax (or sometimes just tuples) and simply build a list of them, but the challenge in taking this to a higher level is reading in a user-supplied definition, perhaps translated as 'the first column should be indexed by the string "Purchase amount" and contains a Double', and then performing calculations on those doubles based on further user input.
>
> I'm trying to get over bad object-oriented habits of assigning attributes at runtime and inspecting types to determine which functions can be applied to which data, and I'm not sure what concepts of functional programming better address these requirements.

My intuition is to use some kind of initial algebra to create a list-like structure /for each record/. For example, with GADTs:

    {-# LANGUAGE GADTs #-}

    -- A named column whose cells have type a
    data Field a = Field { name :: String }
    -- A single cell of type a
    data Value a = Value { value :: a }

Presumably, your data definition will parse into:

    data RecordScheme where
      NoFields :: RecordScheme
      AddField :: Field a -> RecordScheme -> RecordScheme

And then, given a record scheme, you can construct a Table by running the appropriate queries for the scheme and Populating its Records.

    data Record where
      EndOfRecord :: Record
      Populate    :: Value a -> Record -> Record

    type Table = [Record]
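The definitions in this message hide each field's type inside `AddField` and `Populate`, so code that later consumes a `Record` has no way to learn what a given value is. One possible way to make them usable, offered only as a sketch, is to store a type witness next to each field and value. The `ColType` witness, the extra `colType` and `witness` fields, and the functions `parseValue`, `parseRecord` and `doubleAt` are all additions that do not appear in the thread.

```haskell
{-# LANGUAGE GADTs #-}

import Text.Read (readMaybe)

-- Hypothetical type witness: matching on it reveals the column's type.
data ColType a where
  TStr :: ColType String
  TInt :: ColType Integer
  TDbl :: ColType Double

-- Field and Value as in the message, each extended with a witness.
data Field a = Field { name :: String, colType :: ColType a }
data Value a = Value { witness :: ColType a, value :: a }

data RecordScheme where
  NoFields :: RecordScheme
  AddField :: Field a -> RecordScheme -> RecordScheme

data Record where
  EndOfRecord :: Record
  Populate    :: Value a -> Record -> Record

type Table = [Record]

-- Parse one raw field at the type named by the witness.
parseValue :: ColType a -> String -> Maybe (Value a)
parseValue TStr s = Just (Value TStr s)
parseValue TInt s = Value TInt <$> readMaybe s
parseValue TDbl s = Value TDbl <$> readMaybe s

-- Walk the scheme and the raw fields together; fail on any mismatch.
parseRecord :: RecordScheme -> [String] -> Maybe Record
parseRecord NoFields [] = Just EndOfRecord
parseRecord (AddField f rest) (raw : raws) = do
  v <- parseValue (colType f) raw
  Populate v <$> parseRecord rest raws
parseRecord _ _ = Nothing

-- Read the Double at a zero-based position, if that field holds one.
-- Matching on TDbl is what lets the hidden type be used as Double.
doubleAt :: Int -> Record -> Maybe Double
doubleAt _ EndOfRecord = Nothing
doubleAt 0 (Populate (Value TDbl d) _) = Just d
doubleAt 0 _ = Nothing
doubleAt n (Populate _ rest) = doubleAt (n - 1) rest
```

For a scheme such as `AddField (Field "Item" TStr) (AddField (Field "Purchase amount" TDbl) NoFields)`, `parseRecord` accepts only rows with a string and then a double, and `doubleAt 1` recovers the amount. The witness costs one extra field per value, and whether that is preferable to a plain sum type like `Cell` is a design judgment this sketch does not settle.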
Re: [Haskell-cafe] representing spreadsheets
Thanks! I think GADTs may work nicely for this project, so I'm going to start building it out.

On Fri, May 27, 2011 at 4:16 PM, Alexander Solla alex.so...@gmail.com wrote:
> My intuition is to use some kind of initial algebra to create a list-like structure /for each record/. For example, with GADTs: [...]