[racket-users] Printing Quickly

2017-07-24 Thread Lehi Toskin
I have noticed that printing in DrRacket is rather slow and I'm wondering if 
that's something I could speed up. If I have a loop that runs and prints a 
whole lot of information each loop, were I to become overwhelmed and click the 
Stop button, it would take a minute or so before my wishes made it through to 
the program.

Running the same hypothetical program from the command line, the printing 
process is executed much faster and if I Ctrl-C the change is instantaneous.

Am I making any sense?

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] [ANN] MessagePack implementation for Racket

2017-07-24 Thread Jon Zeppieri
On Mon, Jul 24, 2017 at 4:40 PM, Jay McCarthy  wrote:
> On Mon, Jul 24, 2017 at 3:18 PM, Alejandro Sanchez
>  wrote:
>>> - I'm curious of the performance. In particular, I would expect that a
>>> computed jump in unpack could do you good. Did you try that?
>> I haven’t investigated performance yet. As I said, I am new to Racket, this 
>> is my first time doing anything useful in it, my only previous Scheme 
>> knowledge was from doing the exercises in SICP and dabbling in Guile a bit. 
>> What is a computed jump?
>
> Rather than having a big `cond`, you could look up the function that
> does the work in a vector and then call it. IMHO, msgpack was designed
> with that in mind, because tags that aren't immediate values are all
> nicely ordered. So you'd check the size, subtract a constant, and grab
> the appropriate procedure from a constant vector.
>

Or you can use `case`. Racket's `case` with densely-distributed fixnum
constants (like in your code) will:

1. Use a vector lookup to go from the case label (i.e. the tag value)
to the index of the RHS clause.
2. Use an open-coded binary search on the index to get to the code itself.

(The technique is described here:
http://scheme2006.cs.uchicago.edu/07-clinger.pdf)

You might want to define a macro on top of `case`, though, to handle
ranges of tags, since `case` doesn't have syntax for that.
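
As a minimal sketch of the `case` approach (the name `tag->kind` and the
result symbols are hypothetical, not taken from the library; the tag values
follow the MessagePack format table), individual tags go into `case` clauses
and the ranges fall through to ordinary comparisons:

```racket
#lang racket/base
;; Hypothetical sketch: classify a MessagePack tag byte with `case`.
;; `tag->kind` and the result symbols are illustrative names only.
(define (tag->kind tag)
  (case tag
    [(#xc0) 'nil]
    [(#xc2) 'false]
    [(#xc3) 'true]
    [(#xc4 #xc5 #xc6) 'bin]
    [(#xca #xcb) 'float]
    [(#xcc #xcd #xce #xcf) 'uint]
    [(#xd0 #xd1 #xd2 #xd3) 'int]
    [else
     ;; `case` has no range syntax, so ranges such as positive fixint
     ;; (#x00-#x7f) are checked with plain comparisons here.
     (cond
       [(<= #x00 tag #x7f) 'positive-fixint]
       [(<= #xe0 tag #xff) 'negative-fixint]
       [else 'other])]))
```

A range-aware macro on top of `case` would replace the `cond` fallback; it is
left out here to keep the sketch short.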



Re: [racket-users] Re: Decision Tree in Racket - Performance

2017-07-24 Thread Zelphir Kaltstahl
On Monday, July 24, 2017 at 11:36:00 PM UTC+2, Daniel Prager wrote:
> Hi Zelphir
> 
> Thanks for the attribution.
> 
> I'm running on a MacBook Air, 2012 vintage.
> 
> Why not run both my and your code on your machine and compare?
> 
> I made no optimisations other than assuming binary classification.
> 
> Dan
> 
> On Tue, Jul 25, 2017 at 6:46 AM, Zelphir Kaltstahl wrote:
> 
> With my implementation of a list of vectors I only get down to:
> 
> cpu time: 996 real time: 994 gc time: 52
> 
> on my machine. Now I don't know what kind of machine you have, but I guess 
> with such small data sets it does not matter that much and the list of lists 
> implementation is faster, at least for low dimensional data :) It seems 
> vectors involve a bit of overhead or you did some other optimization, which I 
> still have to add to my code. (Maybe assuming binary class, but that should 
> not make that much of a difference, I think. Might try that soon.)
> 
> I added your code as a new file and added a comment at the top of the file:
> 
> #|
> Attribution:
> 
> This implementation of decision trees in Racket was written by Daniel Prager 
> and was originally shared at:
> 
> https://groups.google.com/forum/#!topic/racket-users/cPuTr8lrXCs
> 
> With permission it was added to the project.
> |#
> 
> My project on Github is GPLv3.

Your implementation is more complete than mine. I only got around to fixing the 
things mentioned in this topic, not to implementing all the other stuff still 
waiting for me in the tutorial. As soon as that is done, I can do a comparison. 

Another interesting thing would be to see how many more columns (if any) it 
takes to make vectors the faster implementation.

I also need to take a look at some procedures you used in the code, which I am 
not familiar with yet, but which are standard Racket. There might be clever 
implementations behind those, which are better than what I would write with the 
set of things I know about Racket.



Re: [racket-users] [ANN] MessagePack implementation for Racket

2017-07-24 Thread Philip McGrath
You probably want `integer-in` for the contract on `ext`.
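
A minimal sketch of what that could look like, assuming the `ext` type carries
a signed 8-bit type code and a byte string (the struct, its field names, and
`make-ext` are hypothetical stand-ins, not the library's actual definitions):

```racket
#lang racket/base
(require racket/contract)

;; Hypothetical stand-in for the library's `ext` type; field names are assumed.
(struct ext (type data))

;; `integer-in` accepts exact integers within the inclusive bounds,
;; here the range of a signed 8-bit integer.
(define/contract (make-ext type data)
  (-> (integer-in -128 127) bytes? ext?)
  (ext type data))
```

Calling `(make-ext 200 #"")` would then raise a contract violation instead of
producing a value that cannot be packed.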

Have you read about the special treatment of test submodules?
https://docs.racket-lang.org/guide/Module_Syntax.html#%28part._main-and-test%29

That is the idiomatic Racket way to write tests that run automatically when
you want them and don't get in the way when you don't. If you have
particularly extensive tests, it can also make sense to have a file where
the program consists of nothing but the test submodule.
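
A small illustration of the test submodule pattern (the function `double` is a
made-up example, not part of the MessagePack library):

```racket
#lang racket/base
;; Hypothetical example module; `double` is an illustrative function.
(provide double)

(define (double n) (* 2 n))

;; This submodule runs under `raco test` and in DrRacket, but not when
;; another module requires this file.
(module+ test
  (require rackunit)
  (check-equal? (double 2) 4)
  (check-equal? (double 0) 0))
```

Running `raco test` on the file executes the checks; requiring the module from
elsewhere does not load `rackunit` at all.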

-Philip

On Mon, Jul 24, 2017 at 5:50 PM, Alejandro Sanchez 
wrote:

>
> > On 24 Jul 2017, at 22:40, Jay McCarthy  wrote:
> >
> > On Mon, Jul 24, 2017 at 3:18 PM, Alejandro Sanchez
> >  wrote:
> >>> - I'm curious of the performance. In particular, I would expect that a
> >>> computed jump in unpack could do you good. Did you try that?
> >> I haven’t investigated performance yet. As I said, I am new to Racket,
> this is my first time doing anything useful in it, my only previous Scheme
> knowledge was from doing the exercises in SICP and dabbling in Guile a bit.
> What is a computed jump?
> >
> > Rather than having a big `cond`, you could look up the function that
> > does the work in a vector and then call it. IMHO, msgpack was designed
> > with that in mind, because tags that aren't immediate values are all
> > nicely ordered. So you'd check the size, subtract a constant, and grab
> > the appropriate procedure from a constant vector.
> OK, that’s the sort of thing I would have done in C where the tag would be
> an index into an array of function pointers. Can you please point me to
> where in the manual it explains how to profile a single function?
>
> >>> - Your package collection is 'multi, which is fine, but normally you
> >>> just do that when you're defining something like data/heap or
> >>> net/turkeyrpc, where you are extending some existing collection. In
> >>> particular, you define msgpack and then you also define the test/pack
> >>> collection (where you might expect it to be tests/msgpack/pack). I
> >>> recommend having your collection be "msgpack" and putting your tests
> >>> inside a tests sub-directory.
> >> Just to make sure I understood correctly: ‘msgpack’ is the umbrella
> module that users import, ‘msgpack/test/pack’ (and ‘unpack’) are the test
> modules that will be run for testing only. How about the directory
> structure? I like to keep all source files in a source directory (my
> original reason for doing ‘multi), can I still do something like this?
> >>
> >>|-README
> >>|-LICENSE
> >>|-info.rkt
> >>|-source
> >>  |-msgpack.rkt
> >>  |-pack.rkt
> >>  |-unpack.rkt
> >>|-test
> >>  |-pack.rkt
> >>  |-pack
> >>    |- ...
> >>  |-unpack.rkt
> >>  |-unpack
> >>    |- …
> >>
> >> It doesn’t have to be exactly this structure, but the idea is that all
> project-related files are in the root, all the source files in the source
> directory and all the test files in the test directory.
> >
> > You can do that, but you'd have to have an additional `main.rkt` file
> > at the top-level that would require the things in `source` then
> > re-export them. It is not really Racket style to do what you're
> > talking about, however. If you did do that, then you could call the
> > `source` directory, `private` and then it would have a Racket-y name,
> > but your project isn't really large enough to warrant it and those
> > files aren't actually private.
> >
> > Furthermore, your test/pack.rkt and test/unpack.rkt modules aren't
> > necessary, because you should be testing with `raco test -c msgpack`,
> > which will just go find everything. There's no need to build such
> > things yourself. (Although, FWIW, I also wouldn't have separated those
> > tests into such small files with just one or two, because they are,
> > again, so small.)
> I guess I’m weird that way, but I think of a project like a box. When you
> buy a thing and open the box you want all the contents to be neatly
> separated: here is the manual, here is the warranty card, here are the
> parts, all wrapped nicely in a bag. You wouldn’t want the contents to be
> loose and spill all over the floor. That’s why I like to separate the
> project
> into directories by functionality (documentation, source, tests, manuals,
> …). Oh well, if that is the Racket style I’ll do it your way.
>
>
> >>> - On a style level, I think you should remove your lets and turn your
> >>> if/begin blocks into conds, for example:
> >> Good point.
> >>
> >>>
> >>> On Mon, Jul 24, 2017 at 9:17 AM, Alejandro Sanchez
> >>>  wrote:
> >>>> Hello dear Racketeers,
> >>>>
> >>>> I have been writing an implementation of the MessagePack protocol for
> >>>> Racket and I think the library is ready to be showcased now:
> >>>>
> >>>> https://gitlab.com/HiPhish/MsgPack.rkt
> >>>>
> >>>>
> >>>> ### What is MessagePack? ###
> >>>>
> >>>> MessagePack is a binary data 

Re: [racket-users] [ANN] MessagePack implementation for Racket

2017-07-24 Thread Alejandro Sanchez

> On 24 Jul 2017, at 22:40, Jay McCarthy  wrote:
> 
> On Mon, Jul 24, 2017 at 3:18 PM, Alejandro Sanchez
>  wrote:
>>> - I'm curious of the performance. In particular, I would expect that a
>>> computed jump in unpack could do you good. Did you try that?
>> I haven’t investigated performance yet. As I said, I am new to Racket, this 
>> is my first time doing anything useful in it, my only previous Scheme 
>> knowledge was from doing the exercises in SICP and dabbling in Guile a bit. 
>> What is a computed jump?
> 
> Rather than having a big `cond`, you could look up the function that
> does the work in a vector and then call it. IMHO, msgpack was designed
> with that in mind, because tags that aren't immediate values are all
> nicely ordered. So you'd check the size, subtract a constant, and grab
> the appropriate procedure from a constant vector.
OK, that’s the sort of thing I would have done in C where the tag would be an 
index into an array of function pointers. Can you please point me to where in 
the manual it explains how to profile a single function?

>>> - Your package collection is 'multi, which is fine, but normally you
>>> just do that when you're defining something like data/heap or
>>> net/turkeyrpc, where you are extending some existing collection. In
>>> particular, you define msgpack and then you also define the test/pack
>>> collection (where you might expect it to be tests/msgpack/pack). I
>>> recommend having your collection be "msgpack" and putting your tests
>>> inside a tests sub-directory.
>> Just to make sure I understood correctly: ‘msgpack’ is the umbrella module 
>> that users import, ‘msgpack/test/pack’ (and ‘unpack’) are the test modules 
>> that will be run for testing only. How about the directory structure? I like 
>> to keep all source files in a source directory (my original reason for doing 
>> ‘multi), can I still do something like this?
>> 
>>|-README
>>|-LICENSE
>>|-info.rkt
>>|-source
>>  |-msgpack.rkt
>>  |-pack.rkt
>>  |-unpack.rkt
>>|-test
>>  |-pack.rkt
>>  |-pack
>>    |- ...
>>  |-unpack.rkt
>>  |-unpack
>>    |- …
>> 
>> It doesn’t have to be exactly this structure, but the idea is that all 
>> project-related files are in the root, all the source files in the source 
>> directory and all the test files in the test directory.
> 
> You can do that, but you'd have to have an additional `main.rkt` file
> at the top-level that would require the things in `source` then
> re-export them. It is not really Racket style to do what you're
> talking about, however. If you did do that, then you could call the
> `source` directory, `private` and then it would have a Racket-y name,
> but your project isn't really large enough to warrant it and those
> files aren't actually private.
> 
> Furthermore, your test/pack.rkt and test/unpack.rkt modules aren't
> necessary, because you should be testing with `raco test -c msgpack`,
> which will just go find everything. There's no need to build such
> things yourself. (Although, FWIW, I also wouldn't have separated those
> tests into such small files with just one or two, because they are,
> again, so small.)
I guess I’m weird that way, but I think of a project like a box. When you
buy a thing and open the box you want all the contents to be neatly
separated: here is the manual, here is the warranty card, here are the
parts, all wrapped nicely in a bag. You wouldn’t want the contents to be
loose and spill all over the floor. That’s why I like to separate the project
into directories by functionality (documentation, source, tests, manuals,
…). Oh well, if that is the Racket style I’ll do it your way.


>>> - On a style level, I think you should remove your lets and turn your
>>> if/begin blocks into conds, for example:
>> Good point.
>> 
>>> 
>>> On Mon, Jul 24, 2017 at 9:17 AM, Alejandro Sanchez
>>>  wrote:
>>>> Hello dear Racketeers,
>>>>
>>>> I have been writing an implementation of the MessagePack protocol for
>>>> Racket
>>>> and I think the library is ready to be showcased now:
>>>>
>>>> https://gitlab.com/HiPhish/MsgPack.rkt
>>>>
>>>>
>>>> ### What is MessagePack? ###
>>>>
>>>> MessagePack is a binary data serialisation format. The website describes it
>>>> "like JSON but fast and small". Unlike JSON the goal is not a format that's
>>>> human-readable, but one that can be very quickly serialised, transported
>>>> and
>>>> deserialised.
>>>>
>>>> http://msgpack.org/
>>>>
>>>>
>>>> ### About the Racket implementation ###
>>>>
>>>> My goal was to keep everything as simple as possible: there are only two
>>>> functions: pack and unpack. If there is more than one way of packing an
>>>> object
>>>> the smallest format is selected automatically. Here is a taste:
>>>>
>>>>   (require msgpack/pack msgpack/unpack)
>>>>   ;;; A wild hodgepodge to pack
>>>>   (define vec #(1 2 "hello" '(3 4) '() #t))

Re: [racket-users] [ANN] MessagePack implementation for Racket

2017-07-24 Thread Alejandro Sanchez
I will look into that later, one thing at a time :)

> On 24 Jul 2017, at 21:45, Vincent St-Amour  
> wrote:
> 
> Hi Alejandro,
> 
> This looks cool!
> 
> I don't see it listed at pkgs.racket-lang.org. It would be easier for
> users to discover it if you posted it there.
> 
> Vincent
> 
> 
> 
> On Mon, 24 Jul 2017 08:17:30 -0500,
> Alejandro Sanchez wrote:
>> 
>> Hello dear Racketeers,
>> 
>> I have been writing an implementation of the MessagePack protocol for Racket
>> and I think the library is ready to be showcased now:
>> 
>> https://gitlab.com/HiPhish/MsgPack.rkt
>> 
>> ### What is MessagePack? ###
>> 
>> MessagePack is a binary data serialisation format. The website describes it
>> "like JSON but fast and small". Unlike JSON the goal is not a format that's
>> human-readable, but one that can be very quickly serialised, transported and
>> deserialised.
>> 
>> http://msgpack.org/
>> 
>> ### About the Racket implementation ###
>> 
>> My goal was to keep everything as simple as possible: there are only two
>> functions: pack and unpack. If there is more than one way of packing an 
>> object
>> the smallest format is selected automatically. Here is a taste:
>> 
>> (require msgpack/pack msgpack/unpack)
>> ;;; A wild hodgepodge to pack
>> (define vec #(1 2 "hello" '(3 4) '() #t))
>> ;;; A byte string of packed data
>> (define packed
>>   (call-with-output-bytes (λ (out) (pack vec out))))
>> ;;; Unpack the data again
>> (define unpacked
>>   (call-with-input-bytes packed (λ (in) (unpack in))))
>> 
>> As you can see, data is packed to and unpacked from a binary port. I think 
>> this
>> is better than packing/unpacking to binary string because MessagePack is
>> primarily used for inter-process communication, so there is not much point in
>> keeping the packed data inside a process.
>> 
>> I'd appreciate it a lot if a seasoned Racketeer could take a look at my code,
>> in particular if the library is set up properly (the info.rkt files), this is
>> my first time doing something in Racket. I am also open to suggestions about
>> the API, I haven't committed to version 1.0 yet. In particular, I am not
>> familiar with the modularity conventions of Racket libraries, i.e. if it is 
>> OK
>> to have 'msgpack/pack' and 'msgpack/unpack' or if everything should be 
>> covered
>> by one large 'provide' from 'msgpack'? There is one new type 'ext' declared,
>> should that be part of 'msgpack' or should I move it to 'msgpack/types'
>> instead?
>> 
>> On a related note, I find it really annoying that 'integer->integer-bytes' 
>> and
>> 'integer-bytes->integer' do not support 8-bit integers. Is there a reason for
>> that? I had to write all sorts of ugly extra code for the 8-bit cases. I 
>> opened
>> an issue on GitHub about it (#1754).
>> 
>> ### What's next? ###
>> 
>> Once the API settles I would like to move the library to typed Racket. I 
>> would
>> also like to submit it to the Racket packages catalog. The reason I wrote 
>> this
>> library is because I want to eventually write a Racket API client for Neovim:
>> 
>> https://github.com/neovim/neovim
>> https://github.com/neovim/neovim/wiki/Related-projects#api-clients
>> 
>> Neovim is a fork of Vim which aims to stay backwards compatible with Vim, but
>> at the same time bring the code base to modern standards, add long-wanted
>> features and make the editor easier to extend. They have already done a lot 
>> of
>> work, such asynchronous job control, a built-in terminal emulator, Lua
>> scripting and in particular a remote API.
>> 
>> The remote API allows one to write plugins in any language, provided there 
>> is a
>> client for that language. In contrast, Vim has to be compiled with support 
>> for
>> additional scripting languages and the integration burden was on the Vim
>> developers. This meant that popular languages like Python would be pretty 
>> well
>> supported, but more obscure languages were practically useless because no one
>> would re-compile their Vim just for one plugin. The remote API approach means
>> that Racket integration can be de-coupled from the editor development, and we
>> can write plugins that can make use of Racket libraries. One could for 
>> example
>> implement some of the DrRacket features using DrRacket as a library instead 
>> of
>> re-inventing the wheel. It would also be possible to integrate Neovim inside
>> DrRacket or write a Neovim GUI in Racket (GUIs are just very complex plugins 
>> in
>> Neovim).

Re: [racket-users] Re: Decision Tree in Racket - Performance

2017-07-24 Thread Daniel Prager
Hi Zelphir

Thanks for the attribution.

I'm running on a MacBook Air, 2012 vintage.

Why not run both my and your code on your machine and compare?

I made no optimisations other than assuming binary classification.


Dan

On Tue, Jul 25, 2017 at 6:46 AM, Zelphir Kaltstahl <
zelphirkaltst...@gmail.com> wrote:

>
> With my implementation of a list of vectors I only get down to:
>
> cpu time: 996 real time: 994 gc time: 52
>
> on my machine. Now I don't know what kind of machine you have, but I guess
> with such small data sets it does not matter that much and the list of
> lists implementation is faster, at least for low dimensional data :) It
> seems vectors involve a bit of overhead or you did some other optimization,
> which I still have to add to my code. (Maybe assuming binary class, but
> that should not make that much of a difference, I think. Might try that
> soon.)
>
> I added your code as a new file and added a comment at the top of the file:
>
> #|
> Attribution:
>
> This implementation of decision trees in Racket was written by Daniel
> Prager and
> was originally shared at:
>
> https://groups.google.com/forum/#!topic/racket-users/cPuTr8lrXCs
>
> With permission it was added to the project.
> |#
>
> My project on Github is GPLv3.
>
>



[racket-users] Mono, racketscript, ffi, databases and pointer swizzling

2017-07-24 Thread amz3
Héllo,

# Summary

I am slowly moving to Racket as my language of choice. My goal is to port the 
work I've done in GNU Guile and BiwaScheme to Racket and RacketScript [0]. I'd 
like to create a software stack I call "mono": a monolithic solution for 
building applications with a web GUI. By monolithic, I mean that every 
component of the application must live in the same process (in GNU/Linux 
parlance). This means that the application, as I designed it, can only scale 
vertically. It is only meant to work on a single machine. FWIW, I'd like to 
cover most MVP needs on a single machine. Scaling further will require 
rethinking the architecture of the app (which will probably mean going 
distributed, but there might be a middle ground).

[0] https://github.com/vishesh/racketscript

# Frontend 

Regarding the frontend side of things, I am aiming at something like Clojure's 
"om". What I have right now is summarized in forward.scm [1]. It's 201 sloc of 
R6RS with some extensions to interop with JavaScript. I don't have an elegant 
solution for building what is called an "isomorphic", and sometimes even 
"universal", application. 

[1] https://github.com/amirouche/forward.scm 

Nowadays universal/isomorphic architectures are useful for fixing two things:

0) Speed of rendering the very first page, which is silly and a premature 
optimization.

1) SEO. Pure client side rendering is still not liked by most search engines.
Here I am aiming at the next generation of crawlers which *should* (and already 
can) support scraping pure client side applications (via headless browsers).

That said, there is a workaround when working with something like forward.scm. 
You can simply render on the backend the same thing that is rendered on the 
frontend (and not share any code if that's easier). I think that's a compromise 
one can make.

Truly universal/isomorphic webapps are an open problem. Even Hop.js [2] doesn't 
do it in a nice symbiotic way. 

[2] http://hop.inria.fr/home/index.html

A universal webapp should let the developer code in a network-transparent way 
while taking into account:

- latency, because for instance drag'n'drop cannot be fully handled on the 
backend, since you want the interaction to be snappy

- privileges, because for instance client side validation of data allows for a 
smooth UX, but you cannot consider that data safe when it reaches the backend.

- processing power available. This is a more complicated issue.

That said, if you drop the latency problem, you can write a trivial framework 
that computes everything on the server and pay the bill!

tl;dr: Simply said, I only aim at porting 
https://github.com/amirouche/forward.scm to racketscript.

# Backend

Backend side things are more blurry.

What I am sure of is that I want an in-process database. I think it's difficult 
to craft a DSL that can do everything you want to do with a database because of 
the network. Maybe I am wrong. 

My initial goal was to implement something similar to Datomic, but I was not 
sure its immutability was useful. Now I believe it is OK. With the help of 
miniKanren [3], I think I can have something similar to Datomic based on my 
previous work with the wiredtiger database library [4].

By the way, do you recommend binding wiredtiger using typed racket?

[3] There are other solutions in Racket.
[4] http://hyperdev.fr/projects/wiredtiger/

That said, I am not sure it can scale as much as I want/need given Racket's 
thread model [5]. I will experiment.

[5] http://docs.racket-lang.org/reference/eval-model.html#(part._thread-model)

Anyway, I stumbled upon RScheme persistence [6] and thought that it could 
change my approach to disk persistence. Also, someone on the IRC channel told 
me about GemTalk Systems, which apparently achieves transparent persistence and 
distribution. Are you aware of such work in Racket? Can you recommend something 
to read on the subject? Is pointer swizzling a good start?

[6] http://www.rscheme.org/rs/a/2005/persistence/


amz3



Re: [racket-users] Re: Decision Tree in Racket - Performance

2017-07-24 Thread Zelphir Kaltstahl
On Monday, July 24, 2017 at 10:04:36 PM UTC+2, Daniel Prager wrote:
> Jon wrote:
> > Aside: if I read Daniel's solution correct, he avoids the first issue by 
> >assuming that it's a binary classification task (that is, that there are 
> >only two classes).
> 
> 
> Yep: I'm assuming binary classification.
> 
> 
> David wrote:
> > Out of curiosity, how much of that 0.5 seconds is overhead?  Could you run 
> >a simple 'add 1 and 1' procedure and see how long it takes?
> 
> 
> 
> I'm not exactly sure what you mean. Please feel free to profile however you 
> like on the supplied code.
> 
> My observation (primarily to Zelphir) on performance is that lists don't seem 
> like a bad choice for this algorithm.
> 
> If it hadn't been reasonably quick I might have tried replacing the dataset 
> (a list of lists) with a list of vectors, but otherwise I'd be looking at 
> modifying the exhaustive, greedy algorithm itself for possible speedups 
> rather than data structures.
> 
> Zelphir:
> > Maybe you could put it in a repository, so that other people are more 
> >likely to find your code.
> 
> If I ever get back into ML I might, but don't have the time to do a proper 
> write up.
> 
> Please feel free to include it in your github repository, with or without 
> attribution.
> 
> 
> Dan

With my implementation of a list of vectors I only get down to:

cpu time: 996 real time: 994 gc time: 52

on my machine. Now I don't know what kind of machine you have, but I guess with 
such small data sets it does not matter that much and the list of lists 
implementation is faster, at least for low dimensional data :) It seems vectors 
involve a bit of overhead or you did some other optimization, which I still 
have to add to my code. (Maybe assuming binary class, but that should not make 
that much of a difference, I think. Might try that soon.)

I added your code as a new file and added a comment at the top of the file:

#|
Attribution:

This implementation of decision trees in Racket was written by Daniel Prager and
was originally shared at:

https://groups.google.com/forum/#!topic/racket-users/cPuTr8lrXCs

With permission it was added to the project.
|#

My project on Github is GPLv3.



I think I implemented the suggestions made so far in this discussion, except 
memoization of columns. That could be another big time saver.



Re: [racket-users] [ANN] MessagePack implementation for Racket

2017-07-24 Thread Jay McCarthy
On Mon, Jul 24, 2017 at 3:18 PM, Alejandro Sanchez
 wrote:
>> - I'm curious of the performance. In particular, I would expect that a
>> computed jump in unpack could do you good. Did you try that?
> I haven’t investigated performance yet. As I said, I am new to Racket, this 
> is my first time doing anything useful in it, my only previous Scheme 
> knowledge was from doing the exercises in SICP and dabbling in Guile a bit. 
> What is a computed jump?

Rather than having a big `cond`, you could look up the function that
does the work in a vector and then call it. IMHO, msgpack was designed
with that in mind, because tags that aren't immediate values are all
nicely ordered. So you'd check the size, subtract a constant, and grab
the appropriate procedure from a constant vector.
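
A minimal sketch of such a computed jump (the handler procedures, `base-tag`,
and `dispatch` are hypothetical names for illustration; only the four tags
#xc0 to #xc3 are covered, and the handlers ignore their port argument):

```racket
#lang racket/base
;; Hypothetical sketch of a "computed jump": the tag indexes a constant
;; vector of procedures. The handlers are stand-ins, not the library's code.
(define (read-nil in) 'nil)
(define (read-false in) #f)
(define (read-true in) #t)
(define (unsupported in) (error 'dispatch "unsupported tag"))

;; Slot i handles tag (+ base-tag i); #xc1 is unused in MessagePack.
(define base-tag #xc0)
(define handlers (vector read-nil unsupported read-false read-true))

(define (dispatch tag in)
  ;; Subtract the constant, bounds-check, then call the stored procedure.
  (define index (- tag base-tag))
  (if (< -1 index (vector-length handlers))
      ((vector-ref handlers index) in)
      (unsupported in)))
```

A real unpacker would extend the vector to cover every non-immediate tag and
have each handler read its payload from `in`.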

>> - Your package collection is 'multi, which is fine, but normally you
>> just do that when you're defining something like data/heap or
>> net/turkeyrpc, where you are extending some existing collection. In
>> particular, you define msgpack and then you also define the test/pack
>> collection (where you might expect it to be tests/msgpack/pack). I
>> recommend having your collection be "msgpack" and putting your tests
>> inside a tests sub-directory.
> Just to make sure I understood correctly: ‘msgpack’ is the umbrella module 
> that users import, ‘msgpack/test/pack’ (and ‘unpack’) are the test modules 
> that will be run for testing only. How about the directory structure? I like 
> to keep all source files in a source directory (my original reason for doing 
> ‘multi), can I still do something like this?
>
> |-README
> |-LICENSE
> |-info.rkt
> |-source
>   |-msgpack.rkt
>   |-pack.rkt
>   |-unpack.rkt
> |-test
>   |-pack.rkt
>   |-pack
>     |- ...
>   |-unpack.rkt
>   |-unpack
>     |- …
>
> It doesn’t have to be exactly this structure, but the idea is that all 
> project-related files are in the root, all the source files in the source 
> directory and all the test files in the test directory.

You can do that, but you'd have to have an additional `main.rkt` file
at the top-level that would require the things in `source` then
re-export them. It is not really Racket style to do what you're
talking about, however. If you did do that, then you could call the
`source` directory, `private` and then it would have a Racket-y name,
but your project isn't really large enough to warrant it and those
files aren't actually private.

Furthermore, your test/pack.rkt and test/unpack.rkt modules aren't
necessary, because you should be testing with `raco test -c msgpack`,
which will just go find everything. There's no need to build such
things yourself. (Although, FWIW, I also wouldn't have separated those
tests into such small files with just one or two, because they are,
again, so small.)

>> - On a style level, I think you should remove your lets and turn your
>> if/begin blocks into conds, for example:
> Good point.
>
>>
>> On Mon, Jul 24, 2017 at 9:17 AM, Alejandro Sanchez
>>  wrote:
>>> Hello dear Racketeers,
>>>
>>> I have been writing an implementation of the MessagePack protocol for Racket
>>> and I think the library is ready to be showcased now:
>>>
>>> https://gitlab.com/HiPhish/MsgPack.rkt
>>>
>>>
>>> ### What is MessagePack? ###
>>>
>>> MessagePack is a binary data serialisation format. The website describes it
>>> "like JSON but fast and small". Unlike JSON the goal is not a format that's
>>> human-readable, but one that can be very quickly serialised, transported and
>>> deserialised.
>>>
>>> http://msgpack.org/
>>>
>>>
>>> ### About the Racket implementation ###
>>>
>>> My goal was to keep everything as simple as possible: there are only two
>>> functions: pack and unpack. If there is more than one way of packing an
>>> object
>>> the smallest format is selected automatically. Here is a taste:
>>>
>>>   (require msgpack/pack msgpack/unpack)
>>>   ;;; A wild hodgepodge to pack
>>>   (define vec #(1 2 "hello" (3 4) () #t))
>>>   ;;; A byte string of packed data
>>>   (define packed
>>>     (call-with-output-bytes (λ (out) (pack vec out))))
>>>   ;;; Unpack the data again
>>>   (define unpacked
>>>     (call-with-input-bytes packed (λ (in) (unpack in))))
>>>
>>>
>>> As you can see, data is packed to and unpacked from a binary port. I think
>>> this
>>> is better than packing/unpacking to binary string because MessagePack is
>>> primarily used for inter-process communication, so there is not much point
>>> in
>>> keeping the packed data inside a process.
>>>
>>> I'd appreciate it a lot if a seasoned Racketeer could take a look at my
>>> code,
>>> in particular if the library is set up properly (the info.rkt files), this
>>> is
>>> my first time doing something in Racket. I am also open to suggestions about
>>> the API, I haven't committed to version 1.0 yet. In particular, I am not
>>> familiar with the modularity conventions of 

Re: [racket-users] [ANN] MessagePack implementation for Racke

2017-07-24 Thread Jack Firth
> Just to make sure I understood correctly: ‘msgpack’ is the umbrella module 
> that users import, ‘msgpack/test/pack’ (and ‘unpack’) are the test modules 
> that will be run for testing only. How about the directory structure? I like 
> to keep all source files in a source directory (my original reason for doing 
> ‘multi), can I still do something like this?
> 
> |-README
> |-LICENSE
> |-info.rkt
> |-source
>   |-msgpack.rkt
>   |-pack.rkt
>   |-unpack.rkt
> |-test
>   |-pack.rkt
>   |-pack
>     |- ...
>   |-unpack.rkt
>   |-unpack
>     |- …
> 
> It doesn’t have to be exactly this structure, but the idea is that all 
> project-related files are in the root, all the source files in the source 
> directory and all the test files in the test directory.

That you cannot do, and if you wish to do that keeping your code as one package 
might be a little unidiomatic. If you want to keep your test code completely 
separate from your implementation code, and you want both to be separate from 
the top level root of the project, you could have two separate packages each 
with a single collection like so:

|-README
|-LICENSE
|-msgpack-lib
  |-info.rkt ;; collection is "msgpack"
  |-main.rkt
  |-pack.rkt
  |-unpack.rkt
|-msgpack-test
  |-info.rkt ;; collection is "msgpack"
  |-pack-test.rkt
  |-unpack-test.rkt

Having said that, have you considered test submodules? They allow you to write 
your tests in the same file as the code they're testing, while keeping test 
dependencies separate from your library's normal runtime dependencies. With 
test submodules, your code would probably look like this:

(define (pack ...) ...)

(module+ test
  test pack ...)

(define (unpack ...) ...)

(module+ test
  test unpack ...)

And your directory structure would look like this:

|-README
|-LICENSE
|-info.rkt ;; collection is "msgpack"
|-main.rkt
|-pack.rkt ;; has test submodule
|-unpack.rkt ;; has test submodule

You could also put the code into a subdirectory package like above, if you 
really want to keep the project files and the source files separate.
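
To make the test-submodule layout above concrete, here is a minimal sketch of one file that carries its own tests. The `uint8?` helper is a made-up stand-in, not something taken from the MsgPack.rkt sources:

```racket
#lang racket/base
(provide uint8?)

;; Hypothetical helper: does n fit in an unsigned 8-bit integer?
(define (uint8? n)
  (and (exact-nonnegative-integer? n) (<= n 255)))

;; The test submodule is run by `raco test`; it is not instantiated
;; when another module simply requires this file, so rackunit stays
;; out of the library's runtime dependencies.
(module+ test
  (require rackunit)
  (check-true (uint8? 0))
  (check-true (uint8? 255))
  (check-false (uint8? 256))
  (check-false (uint8? -1)))
```

Running `raco test` on the file (or on the whole collection) picks up every `test` submodule it finds, so no separate test driver module should be needed.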



Re: [racket-users] Re: Decision Tree in Racket - Performance

2017-07-24 Thread Daniel Prager
Jon wrote:
> Aside: if I read Daniel's solution correctly, he avoids the first issue by
assuming that it's a binary classification task (that is, that there are
only two classes).

Yep: I'm assuming binary classification.

David wrote:
> Out of curiosity, how much of that 0.5 seconds is overhead?  Could you
run a simple 'add 1 and 1' procedure and see how long it takes?

I'm not exactly sure what you mean. Please feel free to profile however you
like on the supplied code.

My observation (primarily to Zelphir) on performance is that lists don't
seem like a bad choice for this algorithm.

If it hadn't been reasonably quick I might have tried replacing the dataset
(a list of lists) with a list of vectors, but otherwise I'd be looking at
modifying the exhaustive, greedy algorithm itself for possible speedups
rather than data structures.

Zelphir:
> Maybe you could put it in a repository, so that other people are more
likely to find your code.

If I ever get back into ML I might, but don't have the time to do a proper
write up.

Please feel free to include it in your github repository, with or without
attribution.

Dan



Re: [racket-users] [ANN] MessagePack implementation for Racke

2017-07-24 Thread Vincent St-Amour
Hi Alejandro,

This looks cool!

I don't see it listed at pkgs.racket-lang.org. It would be easier for
users to discover it if you posted it there.

Vincent



On Mon, 24 Jul 2017 08:17:30 -0500,
Alejandro Sanchez wrote:
> 
> Hello dear Racketeers,
> 
> I have been writing an implementation of the MessagePack protocol for Racket
> and I think the library is ready to be showcased now:
> 
> https://gitlab.com/HiPhish/MsgPack.rkt
> 
> ### What is MessagePack? ###
> 
> MessagePack is a binary data serialisation format. The website describes it
> "like JSON but fast and small". Unlike JSON the goal is not a format that's
> human-readable, but one that can be very quickly serialised, transported and
> deserialised.
> 
> http://msgpack.org/
> 
> ### About the Racket implementation ###
> 
> My goal was to keep everything as simple as possible: there are only two
> functions: pack and unpack. If there is more than one way of packing an object
> the smallest format is selected automatically. Here is a taste:
> 
> (require msgpack/pack msgpack/unpack)
> ;;; A wild hodgepodge to pack
> (define vec #(1 2 "hello" (3 4) () #t))
> ;;; A byte string of packed data
> (define packed
>   (call-with-output-bytes (λ (out) (pack vec out))))
> ;;; Unpack the data again
> (define unpacked
>   (call-with-input-bytes packed (λ (in) (unpack in))))
> 
> As you can see, data is packed to and unpacked from a binary port. I think 
> this
> is better than packing/unpacking to binary string because MessagePack is
> primarily used for inter-process communication, so there is not much point in
> keeping the packed data inside a process.
> 
> I'd appreciate it a lot if a seasoned Racketeer could take a look at my code,
> in particular if the library is set up properly (the info.rkt files), this is
> my first time doing something in Racket. I am also open to suggestions about
> the API, I haven't committed to version 1.0 yet. In particular, I am not
> familiar with the modularity conventions of Racket libraries, i.e. if it is OK
> to have 'msgpack/pack' and 'msgpack/unpack' or if everything should be covered
> by one large 'provide' from 'msgpack'? There is one new type 'ext' declared,
> should that be part of 'msgpack' or should I move it to 'msgpack/types'
> instead?
> 
> On a related note, I find it really annoying that 'integer->integer-bytes' and
> 'integer-bytes->integer' do not support 8-bit integers. Is there a reason for
> that? I had to write all sorts of ugly extra code for the 8-bit cases. I 
> opened
> an issue on GitHub about it (#1754).
> 
> ### What's next? ###
> 
> Once the API settles I would like to move the library to typed Racket. I would
> also like to submit it to the Racket packages catalog. The reason I wrote this
> library is because I want to eventually write a Racket API client for Neovim:
> 
> https://github.com/neovim/neovim
> https://github.com/neovim/neovim/wiki/Related-projects#api-clients
> 
> Neovim is a fork of Vim which aims to stay backwards compatible with Vim, but
> at the same time bring the code base to modern standards, add long-wanted
> features and make the editor easier to extend. They have already done a lot of
> work, such as asynchronous job control, a built-in terminal emulator, Lua
> scripting and in particular a remote API.
> 
> The remote API allows one to write plugins in any language, provided there is 
> a
> client for that language. In contrast, Vim has to be compiled with support for
> additional scripting languages and the integration burden was on the Vim
> developers. This meant that popular languages like Python would be pretty well
> supported, but more obscure languages were practically useless because no one
> would re-compile their Vim just for one plugin. The remote API approach means
> that Racket integration can be de-coupled from the editor development, and we
> can write plugins that can make use of Racket libraries. One could for example
> implement some of the DrRacket features using DrRacket as a library instead of
> re-inventing the wheel. It would also be possible to integrate Neovim inside
> DrRacket or write a Neovim GUI in Racket (GUIs are just very complex plugins 
> in
> Neovim).
> 



Re: [racket-users] [ANN] MessagePack implementation for Racke

2017-07-24 Thread Alejandro Sanchez

> - I think you should have one module, `msgpack` that exports everything
OK. In my defense, I was originally planning to have a number of ‘pack-*’ 
methods like ‘pack-uint8’, ‘pack-uint16’ and so on, but there was nothing to be 
gained but a bloated interface.

> - You should rewrite your modules to use the `#lang racket/base`
> language so they don't force users to import so much stuff
Good idea, I’ll do that.

> - On integer->integer-bytes, I think that we should support 8-bit
> integers, but in the meantime, I think you should write your own
> version (integer->integer-bytes*) rather than having the handling in
> many places in the code.
Yes, I’ll do that eventually, I was just exhausted from getting it working and 
presentable first.
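
A wrapper along the lines suggested could look roughly like the sketch below. The name `integer->integer-bytes*` comes from the suggestion above; the argument order and the 8-bit range checks are my own assumptions, chosen to mirror the built-in function:

```racket
#lang racket/base

;; Like integer->integer-bytes, but also accepts size 1.
;; Sketch only: the one-byte case is handled by hand, everything else
;; is passed through to the built-in function unchanged.
(define (integer->integer-bytes* n size signed?
                                 [big-endian? (system-big-endian?)])
  (cond
    [(= size 1)
     (unless (if signed? (<= -128 n 127) (<= 0 n 255))
       (raise-argument-error 'integer->integer-bytes*
                             "an integer that fits in 8 bits"
                             n))
     ;; Two's complement for negative values: -1 becomes 255.
     (bytes (if (< n 0) (+ n 256) n))]
    [else (integer->integer-bytes n size signed? big-endian?)]))
```

With this in one place, the packing code can call the wrapper for every width instead of special-casing 8-bit integers at each call site.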

> - I'm curious of the performance. In particular, I would expect that a
> computed jump in unpack could do you good. Did you try that?
I haven’t investigated performance yet. As I said, I am new to Racket, this is 
my first time doing anything useful in it, my only previous Scheme knowledge 
was from doing the exercises in SICP and dabbling in Guile a bit. What is a 
computed jump?
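
For reference, my current understanding is that a computed jump replaces a chain of tests with a table lookup: the tag byte is turned into an index and the handler is fetched from a vector. A minimal sketch of that idea, assuming only the four MessagePack tags `#xC0` to `#xC3` and with made-up handler bodies (returning `(void)` for nil is an arbitrary choice here):

```racket
#lang racket/base

;; First tag covered by the table.
(define base-tag #xC0)

;; One handler per tag, in tag order. Each takes the input port.
(define handlers
  (vector (λ (in) (void))                              ; #xC0 nil
          (λ (in) (error 'unpack "tag #xC1 is unused")) ; #xC1 never used
          (λ (in) #f)                                   ; #xC2 false
          (λ (in) #t)))                                 ; #xC3 true

;; Subtract the base, check the range, then jump through the vector.
(define (unpack-tagged tag in)
  (define idx (- tag base-tag))
  (unless (< -1 idx (vector-length handlers))
    (error 'unpack-tagged "unsupported tag: ~a" tag))
  ((vector-ref handlers idx) in))
```

The lookup costs the same no matter how many tags the table covers, whereas a long `cond` tests its clauses one after another.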

> - Your package collection is 'multi, which is fine, but normally you
> just do that when you're defining something like data/heap or
> net/turkeyrpc, where you are extending some existing collection. In
> particular, you define msgpack and then you also define the test/pack
> collection (where you might expect it to be tests/msgpack/pack). I
> recommend having your collection be "msgpack" and putting your tests
> inside a tests sub-directory.
Just to make sure I understood correctly: ‘msgpack’ is the umbrella module that 
users import, ‘msgpack/test/pack’ (and ‘unpack’) are the test modules that will 
be run for testing only. How about the directory structure? I like to keep all 
source files in a source directory (my original reason for doing ‘multi), can I 
still do something like this?

|-README
|-LICENSE
|-info.rkt
|-source
  |-msgpack.rkt
  |-pack.rkt
  |-unpack.rkt
|-test
  |-pack.rkt
  |-pack
    |- ...
  |-unpack.rkt
  |-unpack
    |- …

It doesn’t have to be exactly this structure, but the idea is that all 
project-related files are in the root, all the source files in the source 
directory and all the test files in the test directory.

> - On a style level, I think you should remove your lets and turn your
> if/begin blocks into conds, for example:
Good point.

> 
> On Mon, Jul 24, 2017 at 9:17 AM, Alejandro Sanchez
>  wrote:
>> Hello dear Racketeers,
>> 
>> I have been writing an implementation of the MessagePack protocol for Racket
>> and I think the library is ready to be showcased now:
>> 
>> https://gitlab.com/HiPhish/MsgPack.rkt
>> 
>> 
>> ### What is MessagePack? ###
>> 
>> MessagePack is a binary data serialisation format. The website describes it
>> "like JSON but fast and small". Unlike JSON the goal is not a format that's
>> human-readable, but one that can be very quickly serialised, transported and
>> deserialised.
>> 
>> http://msgpack.org/
>> 
>> 
>> ### About the Racket implementation ###
>> 
>> My goal was to keep everything as simple as possible: there are only two
>> functions: pack and unpack. If there is more than one way of packing an
>> object
>> the smallest format is selected automatically. Here is a taste:
>> 
>>   (require msgpack/pack msgpack/unpack)
>>   ;;; A wild hodgepodge to pack
>>   (define vec #(1 2 "hello" (3 4) () #t))
>>   ;;; A byte string of packed data
>>   (define packed
>>     (call-with-output-bytes (λ (out) (pack vec out))))
>>   ;;; Unpack the data again
>>   (define unpacked
>>     (call-with-input-bytes packed (λ (in) (unpack in))))
>> 
>> 
>> As you can see, data is packed to and unpacked from a binary port. I think
>> this
>> is better than packing/unpacking to binary string because MessagePack is
>> primarily used for inter-process communication, so there is not much point
>> in
>> keeping the packed data inside a process.
>> 
>> I'd appreciate it a lot if a seasoned Racketeer could take a look at my
>> code,
>> in particular if the library is set up properly (the info.rkt files), this
>> is
>> my first time doing something in Racket. I am also open to suggestions about
>> the API, I haven't committed to version 1.0 yet. In particular, I am not
>> familiar with the modularity conventions of Racket libraries, i.e. if it is
>> OK
>> to have 'msgpack/pack' and 'msgpack/unpack' or if everything should be
>> covered
>> by one large 'provide' from 'msgpack'? There is one new type 'ext' declared,
>> should that be part of 'msgpack' or should I move it to 'msgpack/types'
>> instead?
>> 
>> On a related note, I find it really annoying that 'integer->integer-bytes'
>> and
>> 'integer-bytes->integer' do not support 8-bit integers. Is there a reason
>> for
>> that? I had to write all sorts of ugly extra code for the 8-bit cases. I
>> opened
>> an issue on GitHub 

Re: [racket-users] Blame for contracts on applicable serializable structs

2017-07-24 Thread Robby Findler
One approach would be to not expect the clients to use deserialize
directly but provide a thin wrapper module which would be the place to
hang the blame information (and it would use `contract-out`).
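
A rough sketch of what I mean, using a simplified stand-in rather than the `adder` struct from the thread: here the serialized form is just the base number, and `read-adder` is a made-up name for the wrapper that clients would call instead of `deserialize`.

```racket
#lang racket

(module adder-io racket
  (require racket/serialize)
  ;; The contract sits on the wrapper's export, so the module that
  ;; requires read-adder is the party responsible for how it uses
  ;; the procedure that comes back.
  (provide (contract-out
            [read-adder (-> any/c (-> natural-number/c natural-number/c))]))
  ;; Stand-in for real deserialization: rebuild a closure from the
  ;; deserialized base number.
  (define (read-adder v)
    (define base (deserialize v))
    (λ (x) (+ base x))))

(require 'adder-io racket/serialize)

(define f (read-adder (serialize 5)))
(f 1)
```

Calling `(f 'not-a-number)` should then raise a contract violation that blames the requiring module rather than reporting an error from `+`.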

Robby


On Mon, Jul 24, 2017 at 1:32 PM, Matthew Flatt  wrote:
> At Mon, 24 Jul 2017 12:35:51 -0500, Philip McGrath wrote:
>> I've also tried putting the
>> definition of `deserialize-info:adder-v0` in a different module, so that
>> its version of `adder` has a contract, but then the binding isn't seen by
>> `make-serialize-info`.
>
> In case you still want to pursue that direction, you can use the pair
> form of the second argument to `make-serialize-info`, which pairs a
> symbol with a reference to an exporting module. See the example below.
>
> (I think there's probably a library that's better to use than a raw
> `variable-reference->module-path-index` plus `module-path-index-join`,
> but I forget.)
>
> I don't see a way to blame the module that calls `deserialize`.
>
> 
>
> #lang racket
>
> (module server racket
>   (require racket/serialize)
>   (provide (contract-out
>             [adder (-> natural-number/c (-> natural-number/c
>                                             natural-number/c))]))
>   (struct adder (base)
>     #:property prop:procedure
>     (λ (this x)
>       (+ (adder-base this) x))
>     #:property prop:serializable
>     (make-serialize-info (λ (this) (vector (adder-base this)))
>                          (cons 'deserialize-info:adder-v0
>                                (module-path-index-join
>                                 '(submod "." deserialize-info)
>                                 (variable-reference->module-path-index
>                                  (#%variable-reference))))
>                          #f
>                          (or (current-load-relative-directory)
>                              (current-directory))))
>   (define/contract make-adder
>     (-> natural-number/c (-> natural-number/c
>                              natural-number/c))
>     adder)
>
>   (module* deserialize-info racket/base
>     (require (submod ".."))
>     (require racket/serialize)
>     (provide deserialize-info:adder-v0)
>     (define deserialize-info:adder-v0
>       (make-deserialize-info adder
>                              (λ () (error 'adder
>                                           "can't have cycles"))))))
>
>
> (require 'server racket/serialize)
>
> ((deserialize (serialize (adder 5))) 'not-a-number)
>



Re: [racket-users] Blame for contracts on applicable serializable structs

2017-07-24 Thread Matthew Flatt
At Mon, 24 Jul 2017 12:35:51 -0500, Philip McGrath wrote:
> I've also tried putting the
> definition of `deserialize-info:adder-v0` in a different module, so that
> its version of `adder` has a contract, but then the binding isn't seen by
> `make-serialize-info`.

In case you still want to pursue that direction, you can use the pair
form of the second argument to `make-serialize-info`, which pairs a
symbol with a reference to an exporting module. See the example below.

(I think there's probably a library that's better to use than a raw
`variable-reference->module-path-index` plus `module-path-index-join`, 
but I forget.)

I don't see a way to blame the module that calls `deserialize`.



#lang racket

(module server racket
  (require racket/serialize)
  (provide (contract-out
            [adder (-> natural-number/c (-> natural-number/c
                                            natural-number/c))]))
  (struct adder (base)
    #:property prop:procedure
    (λ (this x)
      (+ (adder-base this) x))
    #:property prop:serializable
    (make-serialize-info (λ (this) (vector (adder-base this)))
                         (cons 'deserialize-info:adder-v0
                               (module-path-index-join
                                '(submod "." deserialize-info)
                                (variable-reference->module-path-index
                                 (#%variable-reference))))
                         #f
                         (or (current-load-relative-directory)
                             (current-directory))))
  (define/contract make-adder
    (-> natural-number/c (-> natural-number/c
                             natural-number/c))
    adder)

  (module* deserialize-info racket/base
    (require (submod ".."))
    (require racket/serialize)
    (provide deserialize-info:adder-v0)
    (define deserialize-info:adder-v0
      (make-deserialize-info adder
                             (λ () (error 'adder
                                          "can't have cycles"))))))


(require 'server racket/serialize)

((deserialize (serialize (adder 5))) 'not-a-number)



Re: [racket-users] Blame for contracts on applicable serializable structs

2017-07-24 Thread Matthias Felleisen

Ouch, my fault. I deleted the wrong bindings. 


> On Jul 24, 2017, at 1:58 PM, Philip McGrath  wrote:
> 
> Sorry for the crossed emails. If I understand what's happening correctly, the 
> code you sent only blames the "deserializing" module because it shadows 
> `serialize` and `deserialize` to mean `values`, so the instance is never 
> actually deserialized or serialized and `deserialize-info:adder-v0` is never 
> consulted. If I remove the `local` block, the error is displayed in terms of 
> `+`.
> 
> #lang racket
> 
> (module server racket
>   (require racket/serialize)
>   
>   (provide (contract-out
>             [adder (-> natural-number/c (-> natural-number/c
>                                             natural-number/c))]))
>   
>   (struct adder (base)
>     #:property prop:procedure
>     (λ (this x) (+ (adder-base this) x))
>     #:property prop:serializable
>     (make-serialize-info (λ (this) (vector (adder-base this)))
>                          #'deserialize-info:adder-v0
>                          #f
>                          (or (current-load-relative-directory)
>                              (current-directory))))
>   
>   (define deserialize-info:adder-v0
>     (make-deserialize-info adder (λ () (error 'adder "can't have cycles"))))
>   
>   (module+ deserialize-info
>     (provide deserialize-info:adder-v0)))
>   
> (require (submod "." server) racket/serialize)
> 
> (define x (serialize (adder 5)))
> (displayln x)
> (define f (deserialize x))
> (f 'not-a-number) 
> 
> -Philip
> 
> On Mon, Jul 24, 2017 at 12:50 PM, Philip McGrath  
> wrote:
> It occurs to me that one approach is to use the low-level `contract` form 
> directly. It gives better blame than `define/contract`, at least. The program:
> 
> #lang racket
> 
> (module server racket
>   (require racket/serialize)
>   
>   (provide (contract-out
>             [adder (-> natural-number/c (-> natural-number/c
>                                             natural-number/c))]))
>   
>   (struct adder (base)
>     #:property prop:procedure
>     (λ (this x) (+ (adder-base this) x))
>     #:property prop:serializable
>     (make-serialize-info (λ (this) (vector (adder-base this)))
>                          #'deserialize-info:adder-v0
>                          #f
>                          (or (current-load-relative-directory)
>                              (current-directory))))
>   
>   (define deserialize-info:adder-v0
>     (make-deserialize-info
>      (λ (base)
>        (define inst (adder base))
>        (contract (-> natural-number/c natural-number/c)
>                  inst
>                  '(definition adder)
>                  `(deserialization ,inst)
>                  (object-name adder)
>                  #f))
>      (λ () (error 'adder "can't have cycles"))))
>   
>   (module+ deserialize-info
>     (provide deserialize-info:adder-v0)))
>   
> (require (submod "." server) racket/serialize)
> 
> ((deserialize (serialize (adder 5))) 'not-a-number)
> 
> reports the error:
> 
> adder: contract violation
>   expected: natural-number/c
>   given: 'not-a-number
>   in: the 1st argument of
>   (-> natural-number/c natural-number/c)
>   contract from: (definition adder)
>   blaming: (deserialization #)
>(assuming the contract is correct)
> 
> 
> -Philip
> 
> On Mon, Jul 24, 2017 at 12:35 PM, Philip McGrath  
> wrote:
> That is precisely the contract violation I'd like to see reported, but, 
> without the shadowing of serialize and deserialize, the error is reported in 
> terms of `+`. (And, if it wasn't clear, I do intend to actually read and 
> write the serialized instance.)
> 
> I (think) I understand why deserialization strips the contract from the 
> instance: the contract is added at the module boundary using the 
> chaperone/impersonator infrastructure, and deserialization uses the 
> unprotected form of `adder` passed to `make-deserialize-info` within the 
> server module. 
> 
> What I don't understand is how to give `make-deserialize-info` a function 
> that (1) has a contract where (2) fulfilling the range part of the contract 
> becomes the deserializing module's obligation — if such a thing is possible. 
> Aside from attempts with `define/contract` (which as I now understand 
> achieved point 1 but not point 2), I've also tried putting the definition of 
> `deserialize-info:adder-v0` in a different module, so that its version of 
> `adder` has a contract, but then the binding isn't seen by 
> `make-serialize-info`.
> 
> -Philip
> 
> On Mon, Jul 24, 2017 at 7:30 AM, Matthias Felleisen  
> wrote:
> 
>> On Jul 23, 2017, at 10:50 PM, Philip McGrath  
>> wrote:
>> 
>> If I'm following correctly, I think that's what I was trying to do, but I'm 
>> unclear how to give `make-deserialize-info` a variant of `make-adder` that 
>> has a contract. The initial example with `define/contract` was the closest 
>> I've come: it at least reported violations in terms of `make-adder` 

Re: [racket-users] Positional arguments in syntax classes

2017-07-24 Thread Sam Waxman
Sorry, I think I need to elaborate a bit more!

I've written code as follows

(define-syntax-class my-id
  (pattern id
    #:do [(if (identifier? #'id) (values)
              (raise-user-error (~a "Expected id: " (syntax->datum #'id))))]))

(define-syntax (my-let stx)
  (syntax-parse stx
    [(_ ([id:my-id binding] ...) body ... last-body)
     #'(let ([id binding] ...) body ... last-body)]))

(define-syntax (reassign stx)
  (syntax-parse stx
    [(_ id:my-id new-value)
     #'(set! id new-value)]))

... other syntax rules using my-id as a syntax class

I enjoy using syntax classes to check that my arguments are id's, but what I 
don't enjoy is how limited my error message is. (I don't want to use Racket's. 
The end users for this are students using very simple self-defined languages, 
and the error messages should be very intuitive, so my-id
is just id but with my own error.)

What I'd like to do is format my errors more like
"let: Expected an identifier to bind but got 3"
"reassign: Expected an identifier to reassign but got 3."

The way I planned to do this was to pass arguments into my syntax class. In the 
racket documentation, it states this is possible. The forms of 
define-syntax-class are as follows.

(define-syntax-class name-id stxclass-option ...
  stxclass-variant ...+)
(define-syntax-class (name-id . kw-formals) stxclass-option ...
  stxclass-variant ...+)

I'm hoping to use the second form to write something like the following

(define-syntax-class (my-id name description)
  (pattern id
    #:do [(if (identifier? #'id) (values)
              (raise-syntax-error name
                (~a " expected " description " but got " (syntax->datum #'id))))]))

This way I could write my let as

(define-syntax (my-let stx)
  (syntax-parse stx
    [(_ ([id:(my-id 'let "an identifier to bind") binding] ...) body ... last-body)
     #'(let ([id binding] ...) body ... last-body)]))

(or to make this look cleaner, I could have a line of code before this that 
says (define let-id (my-id 'let "an identifier to bind")) I don't think that 
would exactly work, but I'm sure it's doable in some way)

The only thing I'm having problems with is that apparently that isn't the right 
way to use the formals. When I try writing it the original way but with the 
updated my-id version, I get 

id:my-id: expected 1 positional argument; got 0 positional arguments in: 
id:my-id

and when I write it with the parentheses like I did in the last example I get 

my-let: expected more terms at: () within: (my-let ([x 2]) 1) in: (my-let ([x 
2]) 1)

On Monday, July 24, 2017 at 1:19:06 PM UTC-4, Alex Knauth wrote:
> On Jul 24, 2017, at 1:06 PM, Sam Waxman  wrote:
> 
> 
> Probably a silly question but I can't figure out how to get this working.
> 
> I want to have a syntax class that I can pass arguments into. So,
> 
> (define-syntax-class (my-syn-class argument)
>   ...
> )
> 
> (syntax-parse #'some-syntax
>   [(_ x:**)])
> 
> 
> In place of ** what do I write to pass the argument into my syntax class?
> I tried 
> 
> (syntax-parse #'some-syntax
>   [(_ x:(my-id *my-argument*)])
> 
> but this doesn't appear to work.
> 
> Thanks in advance!
> 
> 
> You have to use either ~var [1] or #:declare [2].
> 
> 
>   [1]: 
> http://docs.racket-lang.org/syntax/stxparse-patterns.html#%28elem._%28pattern-link._%28~7evar._s%2B%29%29%29
>   [2]: 
> http://docs.racket-lang.org/syntax/stxparse-specifying.html#%28part._.Pattern_.Directives%29
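
Following those two pointers, a minimal sketch of how the arguments might be passed, once with `~var` and once with `#:declare`. I've swapped the explicit `raise-syntax-error` for `#:fail-unless`, which is my own choice and not something from the reply above, and syntax-parse will still add its own prefix to the message:

```racket
#lang racket/base
(require (for-syntax racket/base racket/format syntax/parse))

(begin-for-syntax
  ;; A syntax class with two arguments: the name of the form being
  ;; parsed and a description of what the identifier is for.
  (define-syntax-class (my-id name description)
    (pattern x
      #:fail-unless (identifier? #'x)
      (~a name ": expected " description
          " but got " (syntax->datum #'x)))))

;; Arguments passed inline with ~var.
(define-syntax (my-let stx)
  (syntax-parse stx
    [(_ ([(~var id (my-id 'let "an identifier to bind")) binding] ...)
        body ... last-body)
     #'(let ([id binding] ...) body ... last-body)]))

;; Arguments passed with a #:declare directive after the pattern.
(define-syntax (reassign stx)
  (syntax-parse stx
    [(_ id new-value)
     #:declare id (my-id 'reassign "an identifier to reassign")
     #'(set! id new-value)]))
```

The `id:my-id` colon shorthand only works for syntax classes that take no arguments, which would explain the "expected 1 positional argument" error; `(my-let ([3 2]) 1)` should now fail at expansion time with the custom message.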



Re: [racket-users] Blame for contracts on applicable serializable structs

2017-07-24 Thread Philip McGrath
That is precisely the contract violation I'd like to see reported, but,
without the shadowing of serialize and deserialize, the error is reported
in terms of `+`. (And, if it wasn't clear, I do intend to actually read and
write the serialized instance.)

I (think) I understand why deserialization strips the contract from the
instance: the contract is added at the module boundary using the
chaperone/impersonator infrastructure, and deserialization uses the
unprotected form of `adder` passed to `make-deserialize-info` within the
server module.

What I don't understand is how to give `make-deserialize-info` a function
that (1) has a contract where (2) fulfilling the range part of the contract
becomes the deserializing module's obligation — if such a thing is
possible. Aside from attempts with `define/contract` (which as I now
understand achieved point 1 but not point 2), I've also tried putting the
definition of `deserialize-info:adder-v0` in a different module, so that
its version of `adder` has a contract, but then the binding isn't seen by
`make-serialize-info`.

-Philip

On Mon, Jul 24, 2017 at 7:30 AM, Matthias Felleisen 
wrote:

>
> On Jul 23, 2017, at 10:50 PM, Philip McGrath 
> wrote:
>
> If I'm following correctly, I think that's what I was trying to do, but
> I'm unclear how to give `make-deserialize-info` a variant of `make-adder`
> that has a contract. The initial example with `define/contract` was the
> closest I've come: it at least reported violations in terms of `make-adder`
> rather than `+`, but (as I now understand) it blamed the `server` module
> for all violations.
>
> -Philip
>
> On Sun, Jul 23, 2017 at 9:27 PM, Matthew Flatt  wrote:
>
>> The original example had an explicit deserializer:
>>
>> At Sun, 23 Jul 2017 19:54:43 -0500, Philip McGrath wrote:
>> >   (define deserialize-info:adder-v0
>> >     (make-deserialize-info make-adder
>> >                            (λ () (error 'adder
>> >                                         "can't have cycles"))))
>>
>> You're constructing the deserializer with `make-adder` --- the variant
>> from inside the `server` module, so it doesn't have a contract.
>>
>> I think this is where you want to draw a new boundary by giving
>> `make-deserialize-info` a variant of `make-adder` that has a contract.
>
>
>
>
> Don’t you just want this:
>
> #lang racket
>
> (module server racket
>   (require racket/serialize)
>
>   (provide (contract-out
>             [adder (-> natural-number/c (-> natural-number/c
>                                             natural-number/c))]))
>
>   (struct adder (base)
>     #:property prop:procedure
>     (λ (this x) (+ (adder-base this) x))
>     #:property prop:serializable
>     (make-serialize-info (λ (this) (vector (adder-base this)))
>                          #'deserialize-info:adder-v0
>                          #f
>                          (or (current-load-relative-directory)
>                              (current-directory))))
>
>   (define deserialize-info:adder-v0
>     (make-deserialize-info adder (λ () (error 'adder "can't have cycles"))))
>
>   (module+ deserialize-info
>     (provide deserialize-info:adder-v0)))
>
> (require (submod "." server) racket/serialize)
>
> (local ((define serialize values)
>         (define deserialize values))
>   (define x (serialize (adder 5)))
>   (define f (deserialize x))
>   (f 'not-a-number))
>
>
>



Re: [racket-users] [ANN] MessagePack implementation for Racke

2017-07-24 Thread Jay McCarthy
This is awesome, thank you!

Some advice:
- I think you should have one module, `msgpack` that exports everything
- You should rewrite your modules to use the `#lang racket/base`
language so they don't force users to import so much stuff
- On integer->integer-bytes, I think that we should support 8-bit
integers, but in the meantime, I think you should write your own
version (integer->integer-bytes*) rather than having the handling in
many places in the code.
- I'm curious about the performance. In particular, I would expect that a
computed jump in unpack could do you good. Did you try that?
- Your package collection is 'multi, which is fine, but normally you
just do that when you're defining something like data/heap or
net/turkeyrpc, where you are extending some existing collection. In
particular, you define msgpack and then you also define the test/pack
collection (where you might expect it to be tests/msgpack/pack). I
recommend having your collection be "msgpack" and putting your tests
inside a tests sub-directory.
- On a style level, I think you should remove your lets and turn your
if/begin blocks into conds, for example:

You have

```
(define (pack-map m out)
  (let ([len (hash-count m)])
    (if (<= len #b00001111)
      (write-byte (bitwise-ior len #b10000000) out)
      (begin
        (cond
          [(uint16? len) (write-byte #xDE out)]
          [(uint32? len) (write-byte #xDF out)]
          [else (error "An map may contain at most 2^32 - 1 items")])
        (write-bytes (integer->bytes len #f) out)))
    (for ([(key value) (in-hash m)])
      (pack key   out)
      (pack value out))))
```

But I would write:

```
(define (pack-map m out)
  (define len (hash-count m))
  (cond
    [(<= len #b00001111)
     (write-byte (bitwise-ior len #b10000000) out)]
    [else
     (cond
       [(uint16? len) (write-byte #xDE out)]
       [(uint32? len) (write-byte #xDF out)]
       [else (error "An map may contain at most 2^32 - 1 items")])
     (write-bytes (integer->bytes len #f) out)])
  (for ([(key value) (in-hash m)])
    (pack key   out)
    (pack value out)))
```
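
As for the `integer->integer-bytes*` idea, here is a rough sketch of what such a wrapper could look like. Only the name comes from the advice above; the argument order, the big-endian default and the range check are my assumptions, not code from the library.

```racket
#lang racket/base

;; Hypothetical wrapper: behaves like integer->integer-bytes but also
;; accepts size 1, so the 8-bit special case lives in one place.
;; Assumption: big-endian by default, since MessagePack uses network order.
(define (integer->integer-bytes* n size signed? [big-endian? #t])
  (cond
    [(= size 1)
     (define lo (if signed? -128 0))
     (define hi (if signed? 127 255))
     (unless (and (exact-integer? n) (<= lo n hi))
       (raise-argument-error 'integer->integer-bytes*
                             (format "exact integer in [~a, ~a]" lo hi)
                             n))
     ;; Two's complement: map negative values into 128..255.
     (bytes (if (< n 0) (+ n 256) n))]
    [else (integer->integer-bytes n size signed? big-endian?)]))

(integer->integer-bytes* 200 1 #f)  ; one unsigned byte
(integer->integer-bytes* -1 1 #t)   ; one signed byte
(integer->integer-bytes* 258 2 #f)  ; delegates to the built-in
```

With something like this, the call sites no longer need their own 8-bit branches, and the wrapper can be dropped if the built-in ever accepts size 1.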


Re: [racket-users] Re: Decision Tree in Racket - Performance

2017-07-24 Thread Zelphir Kaltstahl
On Monday, July 24, 2017 at 6:44:23 PM UTC+2, Matthias Felleisen wrote:
> > On Jul 24, 2017, at 12:37 PM, Zelphir Kaltstahl 
> >  wrote:
> > 
> > In general I think available libraries are important for creating 
> > attraction to a programming language. For example I like Python's machine 
> > learning and data munging libraries and this is one reason why I frequently 
> > use them. It would be great to have something equally great in Racket, even 
> > if experienced Racket programmers can code something up in a day or two.
> 
> 
> Yes! Develop a package for ML in Racket and people will use it!

I think it will take a long time to get there, but by writing a few algorithms 
and trying to make them uniformly accessible, I think I can get some basis 
going. Will see how far my motivation takes me ;) It is also a learning 
experience for me. I've implemented decision trees and linear regression in 
Python before, and half-implemented some other algorithms in online courses, 
but I think there are many algorithms I have not written yet, or have yet to 
even hear about.

I'll make all my ML code freely available on Github probably and will see what 
becomes of it.

I don't have any experience with packaging in Racket yet, but I think that is 
something for later, when I actually get some stuff working and the code 
quality is something I'd like to show someone (unlike the state the code is in 
right now, with all the issues that were pointed out!).



[racket-users] Positional arguments in syntax classes

2017-07-24 Thread Sam Waxman
Probably a silly question but I can't figure out how to get this working.

I want to have a syntax class that I can pass arguments into. So,

(define-syntax-class (my-syn-class argument)
  ...
)

(syntax-parse #'some-syntax
  [(_ x:**)])


In place of ** what do I write to pass the argument into my syntax class?
I tried 

(syntax-parse #'some-syntax
  [(_ x:(my-id *my-argument*)])

but this doesn't appear to work.

Thanks in advance!
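
For reference, a minimal sketch of one way this can be done with `syntax/parse`: the colon shorthand `x:class` only works for syntax classes without arguments, while the `~var` pattern form accepts arguments. The class name `nat-below` and the helper `parse` are made up for this example.

```racket
#lang racket/base
(require syntax/parse)

;; Hypothetical parameterized syntax class: matches a natural-number
;; literal that is smaller than `limit`.
(define-syntax-class (nat-below limit)
  #:description "a natural number below the limit"
  (pattern n:nat
           #:when (< (syntax-e #'n) limit)))

;; ~var binds x and passes the argument 10 to the syntax class.
(define (parse stx)
  (syntax-parse stx
    [(_ (~var x (nat-below 10))) (syntax-e #'x)]
    [_ 'no-match]))

(parse #'(f 3))   ; the literal is below the limit
(parse #'(f 42))  ; the literal is too large, so the fallback clause is used
```

The argument to the syntax class is an ordinary expression, so it can also refer to variables in scope at the `syntax-parse` use site.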



Re: [racket-users] Re: Decision Tree in Racket - Performance

2017-07-24 Thread Matthias Felleisen

> On Jul 24, 2017, at 12:37 PM, Zelphir Kaltstahl  
> wrote:
> 
> In general I think available libraries are important for creating attraction 
> to a programming language. For example I like Python's machine learning and 
> data munging libraries and this is one reason why I frequently use them. It 
> would be great to have something equally great in Racket, even if experienced 
> Racket programmers can code something up in a day or two.


Yes! Develop a package for ML in Racket and people will use it! 



Re: [racket-users] Re: Decision Tree in Racket - Performance

2017-07-24 Thread Zelphir Kaltstahl
I already fixed that `class-labels` issue, thanks; it was a quick fix this 
morning. I tested it with that change and it was way faster.

I thought about going for binary classification only while programming, but I 
think I managed to keep it generic for multiple classes at least in most of the 
code.
Maybe that is not necessary, because every n-class classification problem can 
be expressed as n binary classifications ("in this class or in one of the 
others", for each of the classes).
However, I think that not assuming 2 classes does not impact performance in 
this case.

If splits are not only binary but, for example, three-way, the algorithm 
becomes less efficient.
The reason is that for each split value one would have to check all the other 
split values of a feature again, which would be O(n^2), I believe, instead of 
O(n). So maybe not assuming binary splits just makes things harder for myself, 
because no one is going to do three-way splits or more.

Thanks for your suggestions.

@Daniel Prager:

Thanks for posting that code, I might refer to it when I get there. It seems 
concise. Maybe you could put it in a repository, so that other people are more 
likely to find your code.

In general I think available libraries are important for creating attraction to 
a programming language. For example I like Python's machine learning and data 
munging libraries and this is one reason why I frequently use them. It would be 
great to have something equally great in Racket, even if experienced Racket 
programmers can code something up in a day or two.



Re: [racket-users] Re: Decision Tree in Racket - Performance

2017-07-24 Thread Jon Zeppieri
Just want to emphasize that the main source of inefficiency in your code is 
what I mentioned in my last message (iterating over the class labels of each 
row instead of the unique class labels of the entire data set). The second 
biggest factor is your structural recursion over a non-recursive data type 
(namely, vectors). Everything else, from a performance perspective, is 
insignificant.
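
To illustrate the first point, here is a sketch of counting class labels in a single pass over the rows, so that nothing iterates over labels per row. The name `gini-impurity` and the convention that the label is the last element of each row (as in Daniel's code) are assumptions for this example; this is not Zelphir's actual code.

```racket
#lang racket/base
(require racket/list)

;; Gini impurity of a group of rows, for any number of classes.
;; Assumption: each row is a list whose last element is the class label.
(define (gini-impurity rows)
  (define n (length rows))
  (cond
    [(zero? n) 0.0]
    [else
     ;; One pass over the rows builds a label -> count table.
     (define counts (make-hash))
     (for ([row (in-list rows)])
       (hash-update! counts (last row) add1 0))
     ;; The sum then runs over the unique labels only.
     (- 1.0 (for/sum ([c (in-hash-values counts)])
              (let ([p (/ c n)])
                (* 1.0 p p))))]))

(gini-impurity '((1 0) (2 0) (3 1) (4 1)))  ; evenly mixed group
(gini-impurity '((1 0) (2 0)))              ; pure group
```

The cost is one pass over the rows plus one pass over the distinct labels, independent of how many classes there are.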

Aside: if I read Daniel's solution correctly, he avoids the first issue by 
assuming that it's a binary classification task (that is, that there are only 
two classes).

> On Jul 24, 2017, at 4:08 AM, Zelphir Kaltstahl  
> wrote:
> 
> Wow, thanks for all the feedback, I'll try to get most of the mentioned stuff 
> done and then post an update : )
> 
>> I teach trees and decision trees to freshman students who have never 
>> programmed before, and Racket’s forms of data and functions are extremely 
>> suitable to this domain.
> 
> I think this is relating to using vectors and accessing them by index? How 
> would you represent the data? What forms of data are better suited?
> 
> Mentioned were:
> 
> - struct instead of hash
> - list of vectors instead of vector of vectors
> 
> Your post made me think of functions themselves. Would it be possible to 
> represent the splits as chains of functions and would that have any 
> advantage? Is that what you are hinting at?
> 



Re: [racket-users] How do db handles interact with threads and parameters

2017-07-24 Thread Ryan Culpepper

On 7/24/17 9:11 AM, George Neuner wrote:

Hi David,

On 7/24/2017 8:18 AM, David Storrs wrote:

What happens in the following code?

(define dbh (postgresql-connect ...))

;; Use the DBH in a new thread
(thread (thunk
          (while ...some long-running condition...
            (sleep 1) ; don't flood the DB
            (query-exec dbh "insert into users ..."))))

;; And in the main thread
(query-exec dbh ...)

I now have a database object that's being shared between two threads,
yes?  Or is the object copied when the new thread is created and, if
so, what will that do to the underlying connection to the DB?


The single DBMS connection is being shared by the 2 threads.  That's a
recipe for disaster if both try to use it simultaneously.


To clarify "disaster": Connections are thread-safe, so the queries 
performed by the 2 threads will be interleaved in some order, which is 
fine if both threads are just doing reads. But the threads are not 
isolated ("session-safe" or "conversation-safe"?). For example, if one 
thread changes the time zone, it affects queries made by the other 
thread. If one thread starts and later rolls back a transaction, it 
might undo modifications made by the other thread. And so on.


Ryan



Re: [racket-users] Re: Decision Tree in Racket - Performance

2017-07-24 Thread David Storrs
Out of curiosity, how much of that 0.5 seconds is overhead?  Could you run
a simple 'add 1 and 1' procedure and see how long it takes?

On Mon, Jul 24, 2017 at 9:18 AM, Daniel Prager 
wrote:

> Hi Zelphir
>
> Thank-you for a fun exercise!
>
> My implementation using straight lists runs in around 0.5 second in
> DrRacket on my 2012 Macbook.
>
> Sample output (model construction time + model + accuracy):
>
> cpu time: 507 real time: 561 gc time: 60
>
> '(0
>   1.9157
>   (1 9.1772 1 (0 -1.3612 0 0))
>   (0 2.0597 0 (0 3.1219 0 (0 4.1665 0 (0 4.8278 0 0)))))
>
> 0.8205828779599271
>
> Apologies for the lack of comments.
>
> Dan
>
>
> #lang racket
>
> (define (string->data s [sep " "])
>   (for/list ([line (string-split s "\n")])
>     (map string->number (string-split line sep))))
>
> (define banknote-data
>   (string->data (file->string "data_banknote_authentication.txt") ","))
>
> (define test-data
>   (string->data
>"2.771244718 1.784783929 0
> 1.728571309 1.169761413 0
> 3.678319846 2.81281357 0
> 3.961043357 2.61995032 0
> 2.999208922 2.209014212 0
> 7.497545867 3.162953546 1
> 9.00220326 3.339047188 1
> 7.444542326 0.476683375 1
> 10.12493903 3.234550982 1
> 6.642287351 3.319983761 1"))
>
> (define (make-split rows index value)
>   (define-values (left right)
>     (for/fold ([left null] [right null])
>               ([row rows])
>       (if (< (list-ref row index) value)
>           (values (cons row left) right)
>           (values left (cons row right)))))
>   (list left right))
>
> (define (gini-coefficient splits)
>   (for/sum ([split splits])
>     (define n (* 1.0 (length split)))
>     (define (g v) (* (/ v n) (- 1.0 (/ v n))))
>     (if (zero? n)
>         0
>         (let ([m (for/sum ([row split] #:when (zero? (last row)))
>                    1)])
>           (+ (g m) (g (- n m)))))))
>
> (define (get-split rows)
>   (define-values (best index value _)
>     (for*/fold ([best null] [i -1] [v -1] [score 999])
>                ([index (in-range (sub1 (length (first rows))))]
>                 [row rows])
>       (let* ([value (list-ref row index)]
>              [s (make-split rows index value)]
>              [gini (gini-coefficient s)])
>         (if (< gini score)
>             (values s index value gini)
>             (values best i v score)))))
>   (list index value best))
>
> (define (to-terminal group)
>   (define zeros (count (λ (row) (zero? (last row))) group))
>   (if (> zeros (- (length group) zeros)) 0 1))
>
> (define (split node max-depth min-size depth)
>   (match-define (list index value (list left right)) node)
>   (define (split-if-small branch)
>     (if (<= (length branch) min-size)
>         (to-terminal branch)
>         (split (get-split branch) max-depth min-size (add1 depth))))
>   (cond [(null? left) (to-terminal right)]
>         [(null? right) (to-terminal left)]
>         [(>= depth max-depth) (list index value
>                                     (to-terminal left) (to-terminal right))]
>         [else (list index value
>                     (split-if-small left) (split-if-small right))]))
>
> (define (build-tree rows max-depth min-size)
>   (split (get-split rows) max-depth min-size 1))
>
> (define (predict node row)
>   (if (list? node)
>       (match-let ([(list index value left right) node])
>         (predict (if (< (list-ref row index) value)
>                      left
>                      right)
>                  row))
>       node))
>
> (define (check-model model validation-set)
>   (/ (count (λ (row) (= (predict model row) (last row)))
>             validation-set)
>      (length validation-set)
>      1.0))
>
> ;(define test-model (build-tree test-data 1 1))
> ;(for/list ([row test-data])
> ;  (list row (predict test-model row)))
>
> (define data (shuffle banknote-data))
> (define model (time (build-tree (take data 274) 5 10)))
>
> model
>
> (check-model model (drop data 274))
>



Re: [racket-users] How do db handles interact with threads and parameters

2017-07-24 Thread David Storrs
Okay, good.  That's all as I expected.  Thanks, George.





[racket-users] [ANN] MessagePack implementation for Racke

2017-07-24 Thread Alejandro Sanchez
Hello dear Racketeers,

I have been writing an implementation of the MessagePack protocol for Racket
and I think the library is ready to be showcased now:

https://gitlab.com/HiPhish/MsgPack.rkt 


### What is MessagePack? ###

MessagePack is a binary data serialisation format. The website describes it
"like JSON but fast and small". Unlike JSON the goal is not a format that's
human-readable, but one that can be very quickly serialised, transported and
deserialised.

http://msgpack.org/ 


### About the Racket implementation ###

My goal was to keep everything as simple as possible: there are only two
functions: pack and unpack. If there is more than one way of packing an object
the smallest format is selected automatically. Here is a taste:

   (require msgpack/pack msgpack/unpack)
   ;;; A wild hodgepodge to pack
   (define vec #(1 2 "hello" (3 4) () #t))
   ;;; A byte string of packed data
   (define packed
     (call-with-output-bytes (λ (out) (pack vec out))))
   ;;; Unpack the data again
   (define unpacked
     (call-with-input-bytes packed (λ (in) (unpack in))))


As you can see, data is packed to and unpacked from a binary port. I think this
is better than packing/unpacking to binary string because MessagePack is
primarily used for inter-process communication, so there is not much point in
keeping the packed data inside a process.

I'd appreciate it a lot if a seasoned Racketeer could take a look at my code,
in particular at whether the library is set up properly (the info.rkt files);
this is my first time doing something in Racket. I am also open to suggestions about
the API, I haven't committed to version 1.0 yet. In particular, I am not
familiar with the modularity conventions of Racket libraries, i.e. if it is OK
to have 'msgpack/pack' and 'msgpack/unpack' or if everything should be covered
by one large 'provide' from 'msgpack'? There is one new type 'ext' declared,
should that be part of 'msgpack' or should I move it to 'msgpack/types'
instead?

On a related note, I find it really annoying that 'integer->integer-bytes' and
'integer-bytes->integer' do not support 8-bit integers. Is there a reason for
that? I had to write all sorts of ugly extra code for the 8-bit cases. I opened
an issue on GitHub about it (#1754).


### What's next? ###

Once the API settles I would like to move the library to typed Racket. I would
also like to submit it to the Racket packages catalog. The reason I wrote this
library is because I want to eventually write a Racket API client for Neovim:

https://github.com/neovim/neovim 
https://github.com/neovim/neovim/wiki/Related-projects#api-clients 


Neovim is a fork of Vim which aims to stay backwards compatible with Vim, but
at the same time bring the code base to modern standards, add long-wanted
features and make the editor easier to extend. They have already done a lot of
work, such as asynchronous job control, a built-in terminal emulator, Lua
scripting and in particular a remote API.

The remote API allows one to write plugins in any language, provided there is a
client for that language. In contrast, Vim has to be compiled with support for
additional scripting languages and the integration burden was on the Vim
developers. This meant that popular languages like Python would be pretty well
supported, but more obscure languages were practically useless because no one
would re-compile their Vim just for one plugin. The remote API approach means
that Racket integration can be de-coupled from the editor development, and we
can write plugins that can make use of Racket libraries. One could for example
implement some of the DrRacket features using DrRacket as a library instead of
re-inventing the wheel. It would also be possible to integrate Neovim inside
DrRacket or write a Neovim GUI in Racket (GUIs are just very complex plugins in
Neovim).



Re: [racket-users] How do db handles interact with threads and parameters

2017-07-24 Thread George Neuner

Hi David,

On 7/24/2017 8:18 AM, David Storrs wrote:

What happens in the following code?

(define dbh (postgresql-connect ...))

;; Use the DBH in a new thread
(thread (thunk
          (while ...some long-running condition...
            (sleep 1) ; don't flood the DB
            (query-exec dbh "insert into users ..."))))

;; And in the main thread
(query-exec dbh ...)

I now have a database object that's being shared between two threads, 
yes?  Or is the object copied when the new thread is created and, if 
so, what will that do to the underlying connection to the DB?


The single DBMS connection is being shared by the 2 threads.  That's a 
recipe for disaster if both try to use it simultaneously.




New scenario:

(define dbh (postgresql-connect ...))
(define current-dbh (make-parameter dbh))
(query-value (current-dbh) "select ...")
(sleep 1000000)
(query-value (current-dbh) "select ...")

I would expect the first query-value to work but by the time the 
second runs the handle has probably been disconnected by the database 
and the query will throw an exception.  Am I understanding that properly?


After sleeping 11.6 days?  Probably.   But it depends on the connection 
keep-alive settings.  By default, PostgreSQL uses the OS settings, which 
amount to roughly 130 minutes assuming that the client understands and 
participates [which Racket does].  However, on most systems, keep-alive 
timeouts can be set far higher.


Aside: I'm not aware of any published limits on keep-alive settings, 
only the defaults are known for various systems.  If the settings are 
just integral seconds and counts, and (theoretically) could go to 
MAX_INT, that would represent  ~3^20 years on a 32-bit machine ... for 
all practical purposes, infinite.



Final scenario, identical to the previous except that the parameter is 
a promise:

(define dbh (postgresql-connect ...))
(define current-dbh (make-parameter (delay dbh)))
(query-value (force (current-dbh)) "select ...")
(sleep 1000000)
(query-value (force (current-dbh)) "select ...")

Would this make any difference?  I can't see why but thought I should ask.


There's no difference wrt the above.  You aren't delaying the *opening* 
of the connection - you are simply performing unnecessary gyrations to 
get at the handle.   There would be a difference if instead you wrote:


   (define dbh (delay (postgresql-connect ...)))
   (define current-dbh (make-parameter dbh))
   (query-value (force (current-dbh)) "select ...")
   (sleep ...)
   (query-value (force (current-dbh)) "select ...")

which would defer opening the connection until the first query. 
Note that a promise caches its result, so the second `force` returns the 
same connection object rather than opening a new one.



George



Re: [racket-users] Blame for contracts on applicable serializable structs

2017-07-24 Thread Matthias Felleisen

> On Jul 23, 2017, at 10:50 PM, Philip McGrath  wrote:
> 
> If I'm following correctly, I think that's what I was trying to do, but I'm 
> unclear how to give `make-deserialize-info` a variant of `make-adder` that 
> has a contract. The initial example with `define/contract` was the closest 
> I've come: it at least reported violations in terms of `make-adder` rather 
> than `+`, but (as I now understand) it blamed the `server` module for all 
> violations.
> 
> -Philip
> 
> On Sun, Jul 23, 2017 at 9:27 PM, Matthew Flatt wrote:
> The original example had an explicit deserializer:
> 
> At Sun, 23 Jul 2017 19:54:43 -0500, Philip McGrath wrote:
> >   (define deserialize-info:adder-v0
> >     (make-deserialize-info make-adder
> >                            (λ () (error 'adder
> >                                         "can't have cycles"))))
> 
> You're constructing the deserializer with `make-adder` --- the variant
> from inside the `server` module, so it doesn't have a contract.
> 
> I think this is where you want to draw a new boundary by giving
> `make-deserialize-info` a variant of `make-adder` that has a contract.



Don’t you just want this: 

#lang racket

(module server racket
  (require racket/serialize)

  (provide (contract-out
            [adder (-> natural-number/c
                       (-> natural-number/c natural-number/c))]))

  (struct adder (base)
    #:property prop:procedure
    (λ (this x) (+ (adder-base this) x))
    #:property prop:serializable
    (make-serialize-info (λ (this) (vector (adder-base this)))
                         #'deserialize-info:adder-v0
                         #f
                         (or (current-load-relative-directory)
                             (current-directory))))

  (define deserialize-info:adder-v0
    (make-deserialize-info adder
                           (λ () (error 'adder "can't have cycles"))))

  (module+ deserialize-info
    (provide deserialize-info:adder-v0)))

(require (submod "." server) racket/serialize)

(local ((define serialize values)
        (define deserialize values))
  (define x (serialize (adder 5)))
  (define f (deserialize x))
  (f 'not-a-number))




[racket-users] How do db handles interact with threads and parameters

2017-07-24 Thread David Storrs
What happens in the following code?

(define dbh (postgresql-connect ...))

;; Use the DBH in a new thread
(thread (thunk
          (while ...some long-running condition...
            (sleep 1) ; don't flood the DB
            (query-exec dbh "insert into users ..."))))

;; And in the main thread
(query-exec dbh ...)

I now have a database object that's being shared between two threads, yes?
Or is the object copied when the new thread is created and, if so, what
will that do to the underlying connection to the DB?
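
For what it's worth, here is a minimal sketch of the sharing question that does not touch a real database. `fake-conn` and `fake-query!` are hypothetical stand-ins invented for illustration, not part of the `db` library; the only point is that a value captured by a thread's closure is the same object, not a copy. Whether a real connection tolerates concurrent use is a separate question that the sketch does not answer.

```racket
#lang racket

;; Hypothetical stand-in for a connection: a lock plus a mutable counter.
(struct fake-conn (lock [count #:mutable]))

;; Hypothetical stand-in for query-exec: bump the counter under the lock.
(define (fake-query! c)
  (call-with-semaphore
   (fake-conn-lock c)
   (λ () (set-fake-conn-count! c (add1 (fake-conn-count c))))))

(define conn (fake-conn (make-semaphore 1) 0))

;; The thunk closes over `conn`; nothing is copied when the thread starts.
(define worker
  (thread (thunk (for ([i (in-range 3)])
                   (fake-query! conn)))))

;; The main thread uses the very same object.
(fake-query! conn)
(thread-wait worker)

(fake-conn-count conn) ; => 4, three updates from the worker plus one here
```

Both threads mutate one object, so any state attached to it (including, by assumption, a socket in a real handle) is shared rather than duplicated.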


New scenario:

(define dbh (postgresql-connect ...))
(define current-dbh (make-parameter dbh))
(query-value (current-dbh) "select ...")
(sleep 100)
(query-value (current-dbh) "select ...")

I would expect the first query-value to work but by the time the second
runs the handle has probably been disconnected by the database and the
query will throw an exception.  Am I understanding that properly?


Final scenario, identical to the previous except that the parameter is a
promise:
(define dbh (postgresql-connect ...))
(define current-dbh (make-parameter (delay dbh)))
(query-value (force (current-dbh)) "select ...")
(sleep 100)
(query-value (force (current-dbh)) "select ...")

Would this make any difference?  I can't see why but thought I should ask.
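
A small sketch of the parameter and promise parts, again without a real database: `fake-connect` is a hypothetical stand-in for `postgresql-connect`, and `current-lazy-dbh` is a name made up here. It only illustrates standard parameter and promise behavior, not anything about connection timeouts.

```racket
#lang racket

(define connect-count 0)

;; Hypothetical stand-in for postgresql-connect: counts how often it runs
;; and returns a fresh, unique value each time.
(define (fake-connect)
  (set! connect-count (add1 connect-count))
  (gensym 'conn))

;; As in the scenario above: the handle already exists when it is delayed.
(define dbh (fake-connect))
(define current-dbh (make-parameter (delay dbh)))

;; Forcing just hands back the existing handle, every time.
(eq? (force (current-dbh)) dbh) ; => #t

;; A different arrangement: delay the connect call itself.
(define current-lazy-dbh (make-parameter (delay (fake-connect))))
(define count-before-force connect-count) ; still 1, nothing connected yet

;; The first force runs fake-connect; later forces reuse the cached result.
(define first-handle (force (current-lazy-dbh)))
(define second-handle (force (current-lazy-dbh)))
(eq? first-handle second-handle) ; => #t
connect-count                    ; => 2
```

So wrapping an already-created handle in `delay` changes nothing about the handle; a promise only makes a difference when the expression being delayed is the one that creates the connection, and even then the result is computed once and cached.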


Obviously I can prevent these issues by using a connection pool or handle
generator function.  I'm simply curious about the underlying functionality.



[racket-users] Re: Decision Tree in Racket - Performance

2017-07-24 Thread Zelphir Kaltstahl
> - Are you running this code inside DrRacket? If so, have you timed the
> difference between running it with debugging enabled and with no
> debugging or profiling? (Language -> Choose Language... -> Details)

I am running from terminal with:

```racket -l errortrace -t FILE``` and
```raco test TESTFILE```



[racket-users] Re: Decision Tree in Racket - Performance

2017-07-24 Thread Zelphir Kaltstahl
Wow, thanks for all the feedback, I'll try to get most of the mentioned stuff 
done and then post an update : )

> I teach trees and decision trees to freshman students who have never 
> programmed before, and Racket’s forms of data and functions are extremely 
> suitable to this domain.

I think this relates to using vectors and accessing them by index? How 
would you represent the data? What forms of data are better suited?

Mentioned were:

- struct instead of hash
- list of vectors instead of vector of vectors

Your post made me think of functions themselves. Would it be possible to 
represent the splits as chains of functions and would that have any advantage? 
Is that what you are hinting at?
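
To make the two ideas concrete, here is a rough sketch of what I imagine, not necessarily what was meant. The names `leaf`, `node`, `classify` and `compile-tree` are made up for illustration, and rows are assumed to be vectors of numbers. The first part uses structs for the tree; the second turns the same tree into nested functions.

```racket
#lang racket

;; Tree as structs: a leaf holds a label, a node holds one split.
(struct leaf (label) #:transparent)
(struct node (feature threshold left right) #:transparent)

;; Walk the struct tree for one row (a vector of numbers).
(define (classify tree row)
  (match tree
    [(leaf label) label]
    [(node feature threshold left right)
     (if (< (vector-ref row feature) threshold)
         (classify left row)
         (classify right row))]))

;; Tree as functions: each split becomes a closure that calls the
;; closure of the chosen subtree.
(define (compile-tree tree)
  (match tree
    [(leaf label) (λ (row) label)]
    [(node feature threshold left right)
     (define go-left (compile-tree left))
     (define go-right (compile-tree right))
     (λ (row)
       (if (< (vector-ref row feature) threshold)
           (go-left row)
           (go-right row)))]))

(define tree
  (node 0 2.5
        (leaf 'a)
        (node 1 1.0 (leaf 'b) (leaf 'c))))

(define predict (compile-tree tree))

(classify tree (vector 3.0 0.5)) ; => 'b
(predict (vector 3.0 0.5))       ; => 'b
(predict (vector 1.0 9.9))       ; => 'a
```

Both forms give the same answers; I have not measured whether either one is faster.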
