Re: [racket-users] Re: Pretty display of tabular data?

2019-03-22 Thread travis . hinkelman
I just came across a post on 
tabular data structures in R, Python, and SQL. The post is written as a 
friendly intro to the subject, which the author claims is a gap that needs 
filling. Thus, the post might not contain much information that is new to 
this group. Perhaps the opportunity is for the Racket community to use that 
friendly intro as a springboard to a comparison of how to approach tabular 
data in Racket.




Re: [racket-users] Re: Pretty display of tabular data?

2019-03-16 Thread jackhfirth
Hooray! Now we're up to 7 tagged packages (that was fast!)



Re: [racket-users] Re: Pretty display of tabular data?

2019-03-16 Thread 'John Clements' via Racket Users
Yep, excellent idea. I’ve added the ’tabular’ tag to csv-writing.

John




-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Re: Pretty display of tabular data?

2019-03-15 Thread Greg Hendershott
This is a great idea. Also I want to point out that:

1. Sometimes it's OK to start by sharing a repo on Git{Hub Lab}. Not
everything needs to go on pkgs.racket-lang.org immediately, to be
visible and shared, especially early on.

(To be clear, I'm not saying, "oh only perfect 1.0 things should be a
package". I'm just pointing out that pkgs.r-l.org isn't fantastic for
discoverability, so if that's the main motivation, it's not your only
or even your best option.)

2. If you do have a package that does XYZ, and someone then makes a
package for X, sometimes it's OK to change your package just to
re-`provide` their module for X (and yours still does Y and Z).

For example, my rackjure package had a threading macro. Then Alexis
made a `threading` package. It was 99% compatible, she took a PR for
the 1%, and I changed rackjure to re-provide that. And the docs say
so. So, it didn't break users of rackjure. Plus people could switch to
using `threading` directly, if/when they wanted. And the Racket world
had one less bit of duplicate code. I think that worked out well.
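
As a rough sketch of that re-provide pattern (not the actual rackjure or
threading source), the following single file uses two submodules as
stand-ins for the two packages. The names `focused`, `umbrella`, `pipe`,
and `shout` are hypothetical and exist only for illustration:

```racket
#lang racket/base

;; Hypothetical stand-in for the small, focused package
;; (the role `threading` plays in the story above).
(module focused racket/base
  (provide pipe)
  ;; Feed a value through one-argument functions, left to right.
  (define (pipe v . fs)
    (for/fold ([acc v]) ([f (in-list fs)])
      (f acc))))

;; Hypothetical stand-in for the larger package
;; (the role `rackjure` plays in the story above).
(module umbrella racket/base
  (require (submod ".." focused))
  ;; Re-export everything from `focused`, plus this module's own binding.
  (provide (all-from-out (submod ".." focused))
           shout)
  (define (shout s) (string-append (string-upcase s) "!")))

;; Existing users keep requiring the umbrella module and still get `pipe`.
(require 'umbrella)
(define result (pipe "hi" shout string-length))
```

With real packages the `require` would name the other package's
collection instead of a submodule, but the `all-from-out` step is the
same idea.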

(In fact if that continued to where rackjure was "merely" a
"meta-package" that re-provided focused packages, I'd be fine with
that!)

Maybe that idea could apply here?



Re: [racket-users] Re: Pretty display of tabular data?

2019-03-15 Thread jackhfirth
I think we should all work towards making our existing code in this area 
more discoverable, so we can get a better sense of what libraries for 
working with tables exist in the wild. To those of you who own Racket 
packages that provide any functionality related to data tables: I recommend 
adding the "tabular" tag to your package's description in the package 
catalog. There's no need to remove more-specific tags (like "data-frame") 
from your package, but even if you have a more specific tag please include 
the general "tabular" tag so it's easy to search for your package. So far 
there are only 3 packages tagged with "tabular" (and one of those is a 
package of mine that I just tagged while writing this post). I see several 
packages that are good candidates for the tag:

   - data-frame
   - sqlite-table
   - table-panel
   - tabular
   - rml-core (maybe?)
   - sinbad
   - spmatrix (maybe?)
   - spreadsheet-editor
   - csv
   - csv-reading
   - csv-writing
   - simple-csv
   - Most things with the "sql" tag
   
The more packages we have tagged and documented, the easier it will be to 
find real code using tables in the wild, which is information we'll need if 
we want to understand how a standard `racket/table` API might look.




Re: [racket-users] Re: Pretty display of tabular data?

2019-03-14 Thread Ryan Kramer
On Thursday, March 14, 2019 at 12:26:39 AM UTC-5, Alex Harsanyi wrote:

>
> There are now several projects announced on this list, all of which deal
> with data analysis in one way or another.  Would it be possible to join
> forces and merge these projects so that we end up with one library that
> serves multiple purposes equally well?  Something where the final product
> is greater than the sum of its parts...
>
> Or perhaps these libraries have aims that are so different from each other
> that the only thing they share is a very abstract concept of "table"?
>

I think my project "plisqin" is one of those you are thinking of. Matt's 
"tbl" is also one. I'm also keeping an eye on Ryan's "sql". Are there any 
more you were thinking of?

Regarding joining forces/merging these projects, this is a good question 
that I think warrants discussion. So I'll share my thoughts.

Obviously I can't speak for all of us, but right now I only see the "very 
abstract concept of "table"" as potential shared code. (Also, learning 
about snip% earlier in this thread was awesome. I'd love to use something 
like that in my project.)

I think the differences between plisqin and tbl are fairly obvious - 
plisqin is an alternative to SQL while tbl is an alternative to 
"Python/NumPy/SciPy, or R/Tidyverse (or, horrors, plain R)"

Now comparing Ryan's sql to plisqin is a different story. These projects 
are both alternatives to SQL. But I think there is enough difference 
between our approaches and scope to warrant separate projects, at least for 
now.
1) sql seems to be mostly implemented as macros. plisqin is mostly 
implemented as procedures.
2) plisqin has some design decisions that some might consider "too much 
magic", namely inline joins and "inject-able aggregates" (need better name) 
as documented here: https://docs.racket-lang.org/plisqin/intro.html. 
Whereas sql-the-package seems to more closely mirror SQL-the-language - it 
would be difficult to surprise yourself with the SQL you generate.
3) I am trying to design #lang plisqin so that people with no Lisp 
experience can use it. (Whether I will succeed is another matter...)

I apologize to Ryan C if I have mischaracterized sql. I'd like to have a 
longer conversation about this, but maybe this list is not the right place. 
(Also, Ryan, if you think our goals are more similar than I do, I'd be 
happy to work with you. You're definitely a more experienced Racketeer and 
it would surely boost my code quality.)

- Ryan Kramer



Re: [racket-users] Re: Pretty display of tabular data?

2019-03-14 Thread Matt Jadud
>
> There are now several projects announced on this list, all of which deal
> with data analysis in one way or another.  Would it be possible to join
> forces and merge these projects so that we end up with one library that
> serves multiple purposes equally well?  Something where the final product
> is greater than the sum of its parts...
>
> Or perhaps these libraries have aims that are so different from each other
> that the only thing they share is a very abstract concept of "table"?
>

Yes?

It makes complete sense, from a practical perspective, to not duplicate
work.

Without bikeshedding ("where should the conversation happen?", "what color
should the logo be?"), there are easier and harder design constraints. For
example, I realized that any sufficiently interesting table interface
would, ultimately, embed a copy of LISP... wait... would be a half-assed
reimplementation of SQL. So, in my rethink, I just set things on top of the
sql library, thus providing the "base language" from which I would work. If
SQL can't do it, it's possible I don't need to do it, and it is 100%
certain that a first-year, who has been programming for 6 weeks, will not
have introductory data questions that cannot be handled by my "target
language."

Keeping a non-leaky abstraction (or, as non-leaky as possible) that lets a
student who is early in HtDP do work with data (in a principled way...
another loaded perspective...) is very important to me. I'll trade all the
fancy databases in the world (as well as the full expressivity of SQL, and
performance for datasets beyond 100K rows, and and and...) for an interface
that does a small number of things very well for novices. If we can do our
design work so that there are demarked shells of increasing complexity,
then yes, I'm confident that we could find ways to combine forces.

If nothing else, I'm already eyeing other libraries that I want to "wrap,"
so that they operate on the substrate I'm laying. I want simplified
plotting (with possibly reduced levels of customization from full 'plot',
either enforced through interfaces or simply enforced by reducing the
documentation for the interface), basic tools for summarizing and analyzing
data (I'm thinking of wrapping the "data-science" library that is floating
around, but not packaged)... so, yes. I don't want to reimplement
everything, but I do need a common substrate. At the moment, I've decided
that anywhere Racket runs (and that my students will use it) is powerful
enough to also have SQLite, and for my time, energy, and task, there are
worse choices than just using SQL.

But, back to bike-shedding... at the least, it might be interesting to kick
around a set of requirements/wants/needs/desires, and from there think
about next steps in design. However, blank-whiteboard design phase work is
challenging in distributed/asynchronous modes, so I'm also concerned that a
mailing list amongst people who do not know each other is a hard way to do
good design on something nuanced... but, that could just be a failing of
mine. Suggestions for "next steps" on collaboration are something I'm
absolutely open to.

At the end of the day, I need tools for next Fall (ideally, sooner, so I
can begin developing course materials); that's a hard, non-optional
design/implementation deadline, no matter how much interest and goodwill
there is to collaborate.

Cheers,
Matt



Re: [racket-users] Re: Pretty display of tabular data?

2019-03-13 Thread Alex Harsanyi
On Thursday, March 14, 2019 at 9:06:12 AM UTC+8, Matt Jadud wrote:
>
> First, thank you for all the great pointers in this thread. It is clear 
> that different renderings will be useful in different contexts, and there are 
> good libraries to leverage in the community. That's what I was hoping. 
>
> https://bitbucket.org/jadudm/tbl/
>
> (I'll add Github as a second push destination shortly.)
>
> There's a story about how I'm at the point of writing this library. 
>


There are now several projects announced on this list, all of which deal with
data analysis in one way or another.  Would it be possible to join forces
and merge these projects so that we end up with one library that serves
multiple purposes equally well?  Something where the final product is
greater than the sum of its parts...

Or perhaps these libraries have aims that are so different from each other
that the only thing they share is a very abstract concept of "table"?

Alex.



Re: [racket-users] Re: Pretty display of tabular data?

2019-03-13 Thread Matt Jadud
First, thank you for all the great pointers in this thread. It is clear
that different renderings will be useful in different contexts, and there are
good libraries to leverage in the community. That's what I was hoping.

https://bitbucket.org/jadudm/tbl/

(I'll add Github as a second push destination shortly.)

There's a story about how I'm at the point of writing this library. The
short version is that I would like to be able to do simple exploratory data
analysis with relatively small data in simple ways. The word "simple" is
grossly loaded in this context, so I'll just say that I want a library that
supports introductory exploratory data analysis in an HtDP context, and I
want it to have a pedagogic growth path, so that if students go off to use
Python/NumPy/SciPy, or R/Tidyverse (or, horrors, plain R), then they've had
the conceptual base to know what they want to do, even if the syntax,
semantics, and learning materials are against them.

This was a first dive into starting to think seriously about syntax-case
and syntax-parse, and that probably led me down roads that were not
entirely productive. I did a lot of implementation work as I explored.

In the last few days, I threw out 3000 lines of exploration, and rewrote it
in 300, much more of which is tests. In particular, I decided that
everything I wanted to do could be handled by an in-memory SQLite database,
I could leverage Ryan's excellent 'sql' library, and in doing so, effectively
design a small "language" (API? interface? perhaps someday a #lang?) that
wraps operations on that data. However, the design of that is subject to
discussion and debate, and it might be that the library ultimately
encapsulates more than one interface, so that different kinds of data
questions can be asked differently.
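
To give a feel for the in-memory SQLite idea, here is a minimal sketch
that uses only Racket's `db` library with plain SQL strings. It is not
the tbl implementation and it skips the 'sql' library entirely; the
`cities` table and its rows are made-up values for illustration, and it
assumes the SQLite native library is available on the machine:

```racket
#lang racket/base
(require db)

;; Open a throwaway SQLite database that lives only in memory.
(define conn (sqlite3-connect #:database 'memory))

;; Hypothetical table and rows, purely for illustration.
(query-exec conn "CREATE TABLE cities (name TEXT, lat REAL, lon REAL)")
(for ([row (in-list '(("alpha" 41.5 -80.0)
                      ("beta"  42.5 -81.0)
                      ("gamma" 44.0 -70.0)))])
  (query-exec conn "INSERT INTO cities VALUES (?, ?, ?)"
              (car row) (cadr row) (caddr row)))

;; A "table operation" then becomes an ordinary query.
(define northern
  (query-list conn
              "SELECT name FROM cities WHERE lat > ? ORDER BY name"
              42.0))
```

Anything a small wrapper "language" offers on top of this (filtering,
sorting, summarizing) can be expressed as SQL run against `conn`.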

So, the abstractive lift of using db/sql was huge, and I also like
rackunit/chk. I'm also wrapping some parts of plot, so that I can have
really, really short pathways to investigating data. It's early days on the
pieces (which, in the rewrite, I sprinted past depth to instead stitch a
complete pipeline in the name of proof-of-concepting the choice to backend
to SQLite). Wrapping everything under a single require, etc., hasn't
happened yet, testing is reasonably underway, and documentation on the
rewrite is currently lagging.

#lang racket
(require tbl/reading/gsheet
         tbl/plot)

;; The source Google Sheet: http://bit.ly/cities-gsheet
;; read-gsheet takes a version published/shared as a CSV
(define T (read-gsheet "http://bit.ly/cities-csv"))
(show (scatter T "LonD" "LatD"))

These two lines let me read in a CSV published via Google Sheets, and get
a scatterplot in DrRacket.

So, that's a long story. However, I'd welcome dialogue. I may come back
with some specific questions. For the moment, I'm exploring. I had (and
will have again) the ability to slurp in SQL databases (SQLite, MySQL,
etc.), I currently do CSV files, and would like to output a number of these
formats as well. In terms of plotting, I'd like to support basics (think
early chapters of Tukey) with some customization, but ultimately know that
I can always drop down to full 'plot' if I need to.

I asked the table-output question so that students can have a richer view
into the tables they're working with. A lot of good pointers were in this
thread.

That's long, but there you go. That's the story. A short version may be
"I'm standing on the shoulders of giants," because the rewrite feels like a
wrapper around sql and db... which, frankly, is lovely. (And, I'm almost
starting to understand how to use the various quasiquoting syntactic forms
in the sql language to build my own frankensteined queries...)

Cheers,
M




On Wed, Mar 13, 2019 at 4:59 PM  wrote:

> I've wanted this too, and got the sense that working with `snip%` instead
> of `gen:custom-write` was 1) the way to go and 2) very difficult. Are you
> planning on using this in some open source code you have right now in a
> github repo or something similar? I'd like to bookmark it.
>
> On Wednesday, March 13, 2019 at 11:19:07 AM UTC-7, Matt Jadud wrote:
>>
>> Hi all,
>>
>> I have a tabular data type that I'd like (I think) to be able to render
>> either in ASCII or in a prettier way in the Interactions pane. I've
>> explored gen:write and friends, and can get the struct to display the way I
>> want---with ASCII. Essentially easy-peasy.
>>
>> What I wonder is: am I able to do something prettier? Can I encapsulate
>> some kind of styled rendering as a snip%, or... something... so that I can
>> render the first 5 and last 5 rows of a table with bolding of headers,
>> etc.?
>>
>> I don't know where to start, essentially, if I wanted to try and do this.
>> Or, perhaps it is not particularly doable.
>>
>> Pointers to examples in codebases are welcome (if such examples exist),
>> and I can work from there. Or, indications that this might be really
>> difficult are also welcome.
>>
>> Cheers,
>> Matt
>>
>> (Apologies if this