[racket-users] Re: query-exec SQLite3 multiple statements

2017-07-28 Thread Alexander McLin
Yes, you can use DrRacket to debug the file at 
collects/db/private/sqlite3/connection.rkt. That one is likely the most 
relevant one for you.

Alternatively, you could wait till Racket 6.10 is released which is very soon I 
think. The bug fixes to the db library might reveal the correct error message 
being generated by SQLite3.

> 
> It works, but I'd still like to figure out exactly what's wrong. I'm 
> relatively new to racket, is it possible to debug the standard library to see 
> whats going wrong?

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Help: How to check a typed/racket type in an untyped Racket contract?

2017-07-28 Thread Matthias Felleisen

After some poking around, here is what I can offer:

#lang racket

(module predicates racket
  (provide discrete-dist-thing?)
  
  ;; -
  ;; dependencies 
  (require math/distributions)

  ;; -
  ;; implementation 
  (define (discrete-dist-thing? x)
(define is-it-discrete-dist-values
  (with-handlers ([exn:fail? (lambda (xn)
   (define msg (exn-message xn))
   (if (regexp-match? #rx"expected: 
discrete-dist-struct" msg)
   #false
   (raise xn)))])
(discrete-dist-values x)))
(and is-it-discrete-dist-values
 (andmap symbol? is-it-discrete-dist-values)
 (let ([probs (discrete-dist-probs x)])
   (and (andmap flonum? probs) (andmap positive? probs))

(module server racket
  (require (submod ".." predicates))
  (provide
   (contract-out
[d discrete-dist-thing?]))

  ;; -
  ;; dependencies 
  (require math/distributions)

  ;; -
  ;; implementation 
  (define d (discrete-dist '(one two

(require 'server)
d


Watch for symbol?. You want to abstract over that. 



> On Jul 28, 2017, at 12:16 PM, Matthias Felleisen  wrote:
> 
>> 
>> On Jul 28, 2017, at 11:23 AM, James Geddes  wrote:
>> 
>> This is likely a typed-untyped interface newbie question.
>> 
>> Briefly, how can I write a contract that asserts that some value has a type 
>> that is defined in an imported typed/racket library?
>> 
>> In more detail, I am using the math/distributions library, which is written 
>> in typed/racket. My module is written in untyped Racket, and I'm figuring 
>> out how to use contacts:
>> 
>> #lang racket
>> (require math/distributions) ;; this is a typed/racket library
>> 
>> (define my-dist (discrete-dist '(thing-one thing-two)))
>> 
>> (provide (contract-out [my-dist ???]))
>> 
>> 
>> The question is: what should ??? be?. There is a distribution? predicate but 
>> I'd quite like to be more specific: namely, that my-dist is a discrete 
>> distribution. In the source of math/distributions, the following type is 
>> defined:
>> 
>>  (define-type (Discrete-Dist A) (discrete-dist-struct A A))
>> 
>> but I don't know how this might translate into my untyped module.
>> 
>> 
>> Any help much appreciated!
> 
> 
> This is a great question. It suggests we should provide these generated 
> contracts (or grant access them) via a modicum of reflection. 
> 
> In the meantime, you want to look at the result type of discrete-dist, which 
> is the type (Discrete-Dist A). This is turn is a subtype of (dist A A), which 
> is opaque/undocumented/I can’t find it in the code base (on the fly). It’s 
> not exported. So at the moment, your will have to take the fall if someone 
> abuses it. 
> 
> — Matthias
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to racket-users+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Help: How to check a typed/racket type in an untyped Racket contract?

2017-07-28 Thread Matthias Felleisen

> On Jul 28, 2017, at 11:23 AM, James Geddes  wrote:
> 
> This is likely a typed-untyped interface newbie question.
> 
> Briefly, how can I write a contract that asserts that some value has a type 
> that is defined in an imported typed/racket library?
> 
> In more detail, I am using the math/distributions library, which is written 
> in typed/racket. My module is written in untyped Racket, and I'm figuring out 
> how to use contacts:
> 
> #lang racket
> (require math/distributions) ;; this is a typed/racket library
> 
> (define my-dist (discrete-dist '(thing-one thing-two)))
> 
> (provide (contract-out [my-dist ???]))
> 
> 
> The question is: what should ??? be?. There is a distribution? predicate but 
> I'd quite like to be more specific: namely, that my-dist is a discrete 
> distribution. In the source of math/distributions, the following type is 
> defined:
> 
>   (define-type (Discrete-Dist A) (discrete-dist-struct A A))
> 
> but I don't know how this might translate into my untyped module.
> 
> 
> Any help much appreciated!


This is a great question. It suggests we should provide these generated 
contracts (or grant access them) via a modicum of reflection. 

In the meantime, you want to look at the result type of discrete-dist, which is 
the type (Discrete-Dist A). This is turn is a subtype of (dist A A), which is 
opaque/undocumented/I can’t find it in the code base (on the fly). It’s not 
exported. So at the moment, your will have to take the fall if someone abuses 
it. 

— Matthias

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[racket-users] Help: How to check a typed/racket type in an untyped Racket contract?

2017-07-28 Thread James Geddes
This is likely a typed-untyped interface newbie question.

Briefly, how can I write a contract that asserts that some value has a type 
that is defined in an imported typed/racket library?

In more detail, I am using the math/distributions library, which is written in 
typed/racket. My module is written in untyped Racket, and I'm figuring out how 
to use contacts:

#lang racket
(require math/distributions) ;; this is a typed/racket library

(define my-dist (discrete-dist '(thing-one thing-two)))

(provide (contract-out [my-dist ???]))


The question is: what should ??? be?. There is a distribution? predicate but 
I'd quite like to be more specific: namely, that my-dist is a discrete 
distribution. In the source of math/distributions, the following type is 
defined:

(define-type (Discrete-Dist A) (discrete-dist-struct A A))

but I don't know how this might translate into my untyped module.


Any help much appreciated!

James




---
James Geddes

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Re: Decision Tree in Racket - Performance

2017-07-28 Thread Philip McGrath
Specifically, as Robby said earlier, `list?` is memoized, so e.g. (first
(rest (build-list 10 values))) only pays this price once. And `list?`
rejects pairs that have cycles.

-Philip

On Fri, Jul 28, 2017 at 5:51 AM, Matthias Felleisen 
wrote:

>
> first and rest were introduced in the teaching languages first when we
> decided it was about principles of design and cadadar was cute but nobody
> should have care.
>
> first and rest are about lists in other languages and the names say so.
> car and cdr are about pairs (not that their names say so) and should be
> called left and right.
>
> first and rest were applied to immutable cons-es in *SL, so we knew that
> no checks were needed.
>
> first and rest migrated to racket but we wanted them to be like those in
> *SL. So we checked. Fortunately, lists became immutable so
>
> list? is essentially O(1) and has negligible cost.
>
>
>
>
>
> On Jul 28, 2017, at 6:40 AM, Robby Findler 
> wrote:
>
> It could have been. I am not sure why (but it probably had something to do
> with better checking for the teaching languages, Matthias may recall more).
>
> Robby
>
> On Fri, Jul 28, 2017 at 4:19 AM Daniel Prager 
> wrote:
>
>> Interesting stuff, but if I may probe a little deeper into scheme
>> history, why couldn't first have simply been defined as a synonym for car
>> (i.e. first item in a pair) and rest a synonym for cdr?
>>
>> Dan
>>
>>
>> On Fri, Jul 28, 2017 at 6:09 PM, Neil Van Dyke 
>> wrote:
>>
>>> Daniel Prager wrote on 07/28/2017 01:03 AM:
>>>
 > `first` and `rest` need to check if the input is a list.

 Why is that?

>>>
>>> When Racket/Scheme/Lisp people speak of checking whether something is a
>>> list, I think they usually mean in the sense of the predicate `list?` (or
>>> `listp`), which is usually an O(n) test for whether a value is a
>>> properly-formed list of pairs (or cons cells) terminated by a null.
>>>
>>> One reason people sometimes do the expensive list check is because it
>>> can be good practice (to have your procedure check its arguments at entry,
>>> so it can say the bug is not its fault, as early as possible).  Teaching
>>> languages are one place I don't think anyone would question whether this is
>>> good practice.
>>>
>>> Another possible reason is that, unless the cost of `list?` has
>>> previously been pointed out to you, it's easy to just use it without even
>>> considering whether it might be expensive.  Even once I knew this, I still
>>> did this at least once, myself.
>>>
>>> When writing list-processing code that has to deal with potentially
>>> large lists, if you're not using a type system that avoids the expensive
>>> `list?` cost, IMHO, it's usually OK for an initial argument type check to
>>> use a `pair?`/`null?` in lieu of `list?`, or possibly do no initial check
>>> at all.  If you want to provide better exceptions on improper lists, you
>>> can check for list correctness as you proceed with your algorithm, and
>>> raise an exception yourself (such as including the original argument value,
>>> which info would not be in the exception if, say, `cdr` raised it).
>>>
>>> BTW, if you want to be professional about list-processing code you write
>>> for use with potentially arbitrary lists, it's good to remember that lists
>>> can have cycles, and... uh, you can usually just document that "behavior is
>>> undefined" for lists with cycles. :) Especially since today you're usually
>>> dealing with immutable pairs, in modern Racket, where cycles are much less
>>> likely than they were when we used `set-cdr!`.  (Or, if you can't afford to
>>> hang in an infinite loop on cycles that are reasonably plausible, and you
>>> don't mind slowing down and complicating things a little, and you don't
>>> mind having a maximum list size for your procedure, you can keep a counter
>>> during what could otherwise be an infinite loop in your algorithm, which
>>> you can then keep checking, to decide whether to raise an exception.  Or
>>> you can do it a harder way, like one of "https://en.wikipedia.org/
>>> wiki/Cycle_detection".[1]  Or you could use a less-basic
>>> language/platform facility, and raise an exception if some limit of compute
>>> resource or time is reached for that procedure call.)
>>>
>>>
>>> [1] Tangential story about school and industry.  The first time that I
>>> hit a speed bump in a software development interview was a long time ago,
>>> when some CS student co-founder of a video game startup asked me how to
>>> detect whether a linked list had cycles.  So, having some schooling and
>>> experience by that time, but not recalling this problem from when I took
>>> Data Structures & Algorithms years earlier, I started working through the
>>> problem, while trying to ignore distracting social cues that the
>>> interviewer might not have realized he was making--  Then the interviewer
>>> tells me he'd recently 

Re: [racket-users] Re: Decision Tree in Racket - Performance

2017-07-28 Thread Matthias Felleisen

first and rest were introduced in the teaching languages first when we decided 
it was about principles of design and cadadar was cute but nobody should have 
care. 

first and rest are about lists in other languages and the names say so. car and 
cdr are about pairs (not that their names say so) and should be called left and 
right. 

first and rest were applied to immutable cons-es in *SL, so we knew that no 
checks were needed. 

first and rest migrated to racket but we wanted them to be like those in *SL. 
So we checked. Fortunately, lists became immutable so 

list? is essentially O(1) and has negligible cost. 





> On Jul 28, 2017, at 6:40 AM, Robby Findler  
> wrote:
> 
> It could have been. I am not sure why (but it probably had something to do 
> with better checking for the teaching languages, Matthias may recall more). 
> 
> Robby
> 
> On Fri, Jul 28, 2017 at 4:19 AM Daniel Prager  > wrote:
> Interesting stuff, but if I may probe a little deeper into scheme history, 
> why couldn't first have simply been defined as a synonym for car (i.e. first 
> item in a pair) and rest a synonym for cdr?
> 
> Dan
> 
> 
> On Fri, Jul 28, 2017 at 6:09 PM, Neil Van Dyke  > wrote:
> Daniel Prager wrote on 07/28/2017 01:03 AM:
> > `first` and `rest` need to check if the input is a list.
> 
> Why is that?
> 
> When Racket/Scheme/Lisp people speak of checking whether something is a list, 
> I think they usually mean in the sense of the predicate `list?` (or `listp`), 
> which is usually an O(n) test for whether a value is a properly-formed list 
> of pairs (or cons cells) terminated by a null.
> 
> One reason people sometimes do the expensive list check is because it can be 
> good practice (to have your procedure check its arguments at entry, so it can 
> say the bug is not its fault, as early as possible).  Teaching languages are 
> one place I don't think anyone would question whether this is good practice.
> 
> Another possible reason is that, unless the cost of `list?` has previously 
> been pointed out to you, it's easy to just use it without even considering 
> whether it might be expensive.  Even once I knew this, I still did this at 
> least once, myself.
> 
> When writing list-processing code that has to deal with potentially large 
> lists, if you're not using a type system that avoids the expensive `list?` 
> cost, IMHO, it's usually OK for an initial argument type check to use a 
> `pair?`/`null?` in lieu of `list?`, or possibly do no initial check at all.  
> If you want to provide better exceptions on improper lists, you can check for 
> list correctness as you proceed with your algorithm, and raise an exception 
> yourself (such as including the original argument value, which info would not 
> be in the exception if, say, `cdr` raised it).
> 
> BTW, if you want to be professional about list-processing code you write for 
> use with potentially arbitrary lists, it's good to remember that lists can 
> have cycles, and... uh, you can usually just document that "behavior is 
> undefined" for lists with cycles. :) Especially since today you're usually 
> dealing with immutable pairs, in modern Racket, where cycles are much less 
> likely than they were when we used `set-cdr!`.  (Or, if you can't afford to 
> hang in an infinite loop on cycles that are reasonably plausible, and you 
> don't mind slowing down and complicating things a little, and you don't mind 
> having a maximum list size for your procedure, you can keep a counter during 
> what could otherwise be an infinite loop in your algorithm, which you can 
> then keep checking, to decide whether to raise an exception.  Or you can do 
> it a harder way, like one of "https://en.wikipedia.org/wiki/Cycle_detection 
> ".[1]  Or you could use a 
> less-basic language/platform facility, and raise an exception if some limit 
> of compute resource or time is reached for that procedure call.)
> 
> 
> [1] Tangential story about school and industry.  The first time that I hit a 
> speed bump in a software development interview was a long time ago, when some 
> CS student co-founder of a video game startup asked me how to detect whether 
> a linked list had cycles.  So, having some schooling and experience by that 
> time, but not recalling this problem from when I took Data Structures & 
> Algorithms years earlier, I started working through the problem, while trying 
> to ignore distracting social cues that the interviewer might not have 
> realized he was making--  Then the interviewer tells me he'd recently given 
> the same question to a CS student from his school, who'd immediately named 
> "the" algorithm, and who'd then proceeded to derive a proof of it on the 
> whiteboard.  Then the interviewer vetoed me as a hire, despite the CS systems 
> cred for 

Re: [racket-users] Re: Decision Tree in Racket - Performance

2017-07-28 Thread Robby Findler
It could have been. I am not sure why (but it probably had something to do
with better checking for the teaching languages, Matthias may recall more).

Robby

On Fri, Jul 28, 2017 at 4:19 AM Daniel Prager 
wrote:

> Interesting stuff, but if I may probe a little deeper into scheme history,
> why couldn't first have simply been defined as a synonym for car (i.e.
> first item in a pair) and rest a synonym for cdr?
>
> Dan
>
>
> On Fri, Jul 28, 2017 at 6:09 PM, Neil Van Dyke 
> wrote:
>
>> Daniel Prager wrote on 07/28/2017 01:03 AM:
>>
>>> > `first` and `rest` need to check if the input is a list.
>>>
>>> Why is that?
>>>
>>
>> When Racket/Scheme/Lisp people speak of checking whether something is a
>> list, I think they usually mean in the sense of the predicate `list?` (or
>> `listp`), which is usually an O(n) test for whether a value is a
>> properly-formed list of pairs (or cons cells) terminated by a null.
>>
>> One reason people sometimes do the expensive list check is because it can
>> be good practice (to have your procedure check its arguments at entry, so
>> it can say the bug is not its fault, as early as possible).  Teaching
>> languages are one place I don't think anyone would question whether this is
>> good practice.
>>
>> Another possible reason is that, unless the cost of `list?` has
>> previously been pointed out to you, it's easy to just use it without even
>> considering whether it might be expensive.  Even once I knew this, I still
>> did this at least once, myself.
>>
>> When writing list-processing code that has to deal with potentially large
>> lists, if you're not using a type system that avoids the expensive `list?`
>> cost, IMHO, it's usually OK for an initial argument type check to use a
>> `pair?`/`null?` in lieu of `list?`, or possibly do no initial check at
>> all.  If you want to provide better exceptions on improper lists, you can
>> check for list correctness as you proceed with your algorithm, and raise an
>> exception yourself (such as including the original argument value, which
>> info would not be in the exception if, say, `cdr` raised it).
>>
>> BTW, if you want to be professional about list-processing code you write
>> for use with potentially arbitrary lists, it's good to remember that lists
>> can have cycles, and... uh, you can usually just document that "behavior is
>> undefined" for lists with cycles. :) Especially since today you're usually
>> dealing with immutable pairs, in modern Racket, where cycles are much less
>> likely than they were when we used `set-cdr!`.  (Or, if you can't afford to
>> hang in an infinite loop on cycles that are reasonably plausible, and you
>> don't mind slowing down and complicating things a little, and you don't
>> mind having a maximum list size for your procedure, you can keep a counter
>> during what could otherwise be an infinite loop in your algorithm, which
>> you can then keep checking, to decide whether to raise an exception.  Or
>> you can do it a harder way, like one of "
>> https://en.wikipedia.org/wiki/Cycle_detection".[1]  Or you could use a
>> less-basic language/platform facility, and raise an exception if some limit
>> of compute resource or time is reached for that procedure call.)
>>
>>
>> [1] Tangential story about school and industry.  The first time that I
>> hit a speed bump in a software development interview was a long time ago,
>> when some CS student co-founder of a video game startup asked me how to
>> detect whether a linked list had cycles.  So, having some schooling and
>> experience by that time, but not recalling this problem from when I took
>> Data Structures & Algorithms years earlier, I started working through the
>> problem, while trying to ignore distracting social cues that the
>> interviewer might not have realized he was making--  Then the interviewer
>> tells me he'd recently given the same question to a CS student from his
>> school, who'd immediately named "the" algorithm, and who'd then proceeded
>> to derive a proof of it on the whiteboard.  Then the interviewer vetoed me
>> as a hire, despite the CS systems cred for which his co-founder recruited
>> me in the first place, not that I'm still bitter. :)  So, if you didn't
>> know/remember that there are off-the-shelf algorithms for cycle detection,
>> now you do, and, when you someday interview others, consider not being a
>> punk about it. :)
>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Racket Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to racket-users+unsubscr...@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
> --
> You received this message because you are subscribed to the Google Groups
> "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to racket-users+unsubscr...@googlegroups.com.
> For 

Re: [racket-users] Re: Decision Tree in Racket - Performance

2017-07-28 Thread Neil Van Dyke
As soon as I sent it, I saw I was conflating lists with what CL calls 
proper lists.  I'll just link:


http://docs.racket-lang.org/reference/pairs.html
http://clhs.lisp.se/Body/t_list.htm
http://www.schemers.org/Documents/Standards/R5RS/HTML/r5rs-Z-H-9.html#%_sec_6.3.2

--
You received this message because you are subscribed to the Google Groups "Racket 
Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Re: Decision Tree in Racket - Performance

2017-07-28 Thread Daniel Prager
Interesting stuff, but if I may probe a little deeper into scheme history,
why couldn't first have simply been defined as a synonym for car (i.e.
first item in a pair) and rest a synonym for cdr?

Dan

On Fri, Jul 28, 2017 at 6:09 PM, Neil Van Dyke  wrote:

> Daniel Prager wrote on 07/28/2017 01:03 AM:
>
>> > `first` and `rest` need to check if the input is a list.
>>
>> Why is that?
>>
>
> When Racket/Scheme/Lisp people speak of checking whether something is a
> list, I think they usually mean in the sense of the predicate `list?` (or
> `listp`), which is usually an O(n) test for whether a value is a
> properly-formed list of pairs (or cons cells) terminated by a null.
>
> One reason people sometimes do the expensive list check is because it can
> be good practice (to have your procedure check its arguments at entry, so
> it can say the bug is not its fault, as early as possible).  Teaching
> languages are one place I don't think anyone would question whether this is
> good practice.
>
> Another possible reason is that, unless the cost of `list?` has previously
> been pointed out to you, it's easy to just use it without even considering
> whether it might be expensive.  Even once I knew this, I still did this at
> least once, myself.
>
> When writing list-processing code that has to deal with potentially large
> lists, if you're not using a type system that avoids the expensive `list?`
> cost, IMHO, it's usually OK for an initial argument type check to use a
> `pair?`/`null?` in lieu of `list?`, or possibly do no initial check at
> all.  If you want to provide better exceptions on improper lists, you can
> check for list correctness as you proceed with your algorithm, and raise an
> exception yourself (such as including the original argument value, which
> info would not be in the exception if, say, `cdr` raised it).
>
> BTW, if you want to be professional about list-processing code you write
> for use with potentially arbitrary lists, it's good to remember that lists
> can have cycles, and... uh, you can usually just document that "behavior is
> undefined" for lists with cycles. :) Especially since today you're usually
> dealing with immutable pairs, in modern Racket, where cycles are much less
> likely than they were when we used `set-cdr!`.  (Or, if you can't afford to
> hang in an infinite loop on cycles that are reasonably plausible, and you
> don't mind slowing down and complicating things a little, and you don't
> mind having a maximum list size for your procedure, you can keep a counter
> during what could otherwise be an infinite loop in your algorithm, which
> you can then keep checking, to decide whether to raise an exception.  Or
> you can do it a harder way, like one of "https://en.wikipedia.org/wiki
> /Cycle_detection".[1]  Or you could use a less-basic language/platform
> facility, and raise an exception if some limit of compute resource or time
> is reached for that procedure call.)
>
>
> [1] Tangential story about school and industry.  The first time that I hit
> a speed bump in a software development interview was a long time ago, when
> some CS student co-founder of a video game startup asked me how to detect
> whether a linked list had cycles.  So, having some schooling and experience
> by that time, but not recalling this problem from when I took Data
> Structures & Algorithms years earlier, I started working through the
> problem, while trying to ignore distracting social cues that the
> interviewer might not have realized he was making--  Then the interviewer
> tells me he'd recently given the same question to a CS student from his
> school, who'd immediately named "the" algorithm, and who'd then proceeded
> to derive a proof of it on the whiteboard.  Then the interviewer vetoed me
> as a hire, despite the CS systems cred for which his co-founder recruited
> me in the first place, not that I'm still bitter. :)  So, if you didn't
> know/remember that there are off-the-shelf algorithms for cycle detection,
> now you do, and, when you someday interview others, consider not being a
> punk about it. :)
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to racket-users+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Re: Decision Tree in Racket - Performance

2017-07-28 Thread Neil Van Dyke

Daniel Prager wrote on 07/28/2017 01:03 AM:

> `first` and `rest` need to check if the input is a list.

Why is that?


When Racket/Scheme/Lisp people speak of checking whether something is a 
list, I think they usually mean in the sense of the predicate `list?` 
(or `listp`), which is usually an O(n) test for whether a value is a 
properly-formed list of pairs (or cons cells) terminated by a null.


One reason people sometimes do the expensive list check is because it 
can be good practice (to have your procedure check its arguments at 
entry, so it can say the bug is not its fault, as early as possible).  
Teaching languages are one place I don't think anyone would question 
whether this is good practice.


Another possible reason is that, unless the cost of `list?` has 
previously been pointed out to you, it's easy to just use it without 
even considering whether it might be expensive.  Even once I knew this, 
I still did this at least once, myself.


When writing list-processing code that has to deal with potentially 
large lists, if you're not using a type system that avoids the expensive 
`list?` cost, IMHO, it's usually OK for an initial argument type check 
to use a `pair?`/`null?` in lieu of `list?`, or possibly do no initial 
check at all.  If you want to provide better exceptions on improper 
lists, you can check for list correctness as you proceed with your 
algorithm, and raise an exception yourself (such as including the 
original argument value, which info would not be in the exception if, 
say, `cdr` raised it).


BTW, if you want to be professional about list-processing code you write 
for use with potentially arbitrary lists, it's good to remember that 
lists can have cycles, and... uh, you can usually just document that 
"behavior is undefined" for lists with cycles. :) Especially since today 
you're usually dealing with immutable pairs, in modern Racket, where 
cycles are much less likely than they were when we used `set-cdr!`.  
(Or, if you can't afford to hang in an infinite loop on cycles that are 
reasonably plausible, and you don't mind slowing down and complicating 
things a little, and you don't mind having a maximum list size for your 
procedure, you can keep a counter during what could otherwise be an 
infinite loop in your algorithm, which you can then keep checking, to 
decide whether to raise an exception.  Or you can do it a harder way, 
like one of "https://en.wikipedia.org/wiki/Cycle_detection".[1]  Or you 
could use a less-basic language/platform facility, and raise an 
exception if some limit of compute resource or time is reached for that 
procedure call.)



[1] Tangential story about school and industry.  The first time that I 
hit a speed bump in a software development interview was a long time 
ago, when some CS student co-founder of a video game startup asked me 
how to detect whether a linked list had cycles.  So, having some 
schooling and experience by that time, but not recalling this problem 
from when I took Data Structures & Algorithms years earlier, I started 
working through the problem, while trying to ignore distracting social 
cues that the interviewer might not have realized he was making--  Then 
the interviewer tells me he'd recently given the same question to a CS 
student from his school, who'd immediately named "the" algorithm, and 
who'd then proceeded to derive a proof of it on the whiteboard.  Then 
the interviewer vetoed me as a hire, despite the CS systems cred for 
which his co-founder recruited me in the first place, not that I'm still 
bitter. :)  So, if you didn't know/remember that there are off-the-shelf 
algorithms for cycle detection, now you do, and, when you someday 
interview others, consider not being a punk about it. :)


--
You received this message because you are subscribed to the Google Groups "Racket 
Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[racket-users] Re: Decision Tree in Racket - Performance

2017-07-28 Thread Zelphir Kaltstahl
In general what is the opinion on replacing `first` and `rest` with `car` and 
`cdr`, when considering readability?

I find `first` and `rest` very readable and remember that I got quite confused 
when I started learning Racket with all the cadrdrdrd ;) I still think `first` 
and `rest` reads better, but if the performance is that much worse and even 
needs an O(n) check for `list?` ... maybe it's not the right choice.

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Re: Decision Tree in Racket - Performance

2017-07-28 Thread Zelphir Kaltstahl
On Friday, July 28, 2017 at 2:02:56 AM UTC+2, gustavo wrote:
> I agree with the in-list explanation, but I want to remark a few details.
> 
> >> I don't really understand the (in-list ...) thing. This seems to be 
> >> internal magic to me.
> 
> `in-list` is not a function, it's a macro that looks like a function.
> `for` is another macro (that doesn't look like a function). But in
> Racket the macros are not just straightforward text substitution, they
> can do a lot of things. Most are simple, but some macros are very
> complex. For example `for` can examine some additional information
> stored in `in-list` and write very efficient code. I.e. magic, a lot
> of magic, but you can write your own magical `in-something`. The
> details are in https://docs.racket-lang.org/reference/for.html but
> it's not easy so I recommend reading it later after playing more with
> the simple macros.
> 
> I looked at your solution with vectors. I think that vectors are not
> slower than lists, moreover, they should be slightly faster, but
> sometimes I guess wrong. The problem with your implementation is that
> it creates a lot of intermediate lists. For example in
>(apply + (map (lambda (class-label) ...) class-labels)
> The intermediate list must be allocated and free after use, the values
> are scattered in memory so it may be slower to use them.
> 
> Also, the compiler can write more efficient code for the + in (+ x y)
> than for the + in (apply + ...).
> 
> The solution of Daniel use (for/sum ...) instead of (apply + (map ...)).
> 
> There are a few places that can be rewritten. My recommendation for
> fast code is to try to avoid allocations when possible.
> 
> Gustavo

Thanks, will try to get it done soon! I also don't like the readability aspect 
of some part of code I have, like the nested map summing and initially already 
thought that there must be something better available. It's Racket after all. 
So thanks for pointing it out.

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Re: Decision Tree in Racket - Performance

2017-07-28 Thread Zelphir Kaltstahl
On Thursday, July 27, 2017 at 10:17:00 PM UTC+2, Daniel Prager wrote:
> > Wow, are those timings for the "big" data set?!
> 
> 
> I use 1/5 of the data as a training set, in line with my understanding of the 
> original article, which splits it in 5.
> 
> I use the remaining 4/5 as a single validation set rather than four separate 
> sets (because I got lazy). That won't impact the speed since validating the 
> model is fast.
> 
> > When I tried I could not run @Daniel's solution. There was some kind of 
> > error. I am running Racket 6.8. Should I update to 6.9, could that be the 
> > issue?
> 
> I'm running on 6.8 as well, so it's not that.
> 
> Are you running on Windows or Linux (I'm on Mac)? 
> 
> 
> You may need to modify the string-split to correctly parse line-endings per 
> Gustavo:
> 
> 
> 
> (string-split s #rx"\r?\n")
> 
> 
> 
> Is that it?
> 
> Dan
> 
> 
> 
> 
> On Fri, Jul 28, 2017 at 6:01 AM, Zelphir Kaltstahl  
> wrote:
> 
> 
> On Wednesday, July 26, 2017 at 3:17:46 PM UTC+2, gustavo wrote:
> 
> > I read the solution of Daniel Prager and I have a few minor changes.
> 
> >
> 
> > * First I added a test that repeats `build-tree` 20 times, so the run
> 
> > time is approximately 1-2 seconds. This is not necessary, but times
> 
> > smaller than 1 second are sometimes not reliable. I'm using:
> 
> > ;---
> 
> > (random-seed 12345)
> 
> > (define data2 (shuffle banknote-data))
> 
> > (time
> 
> >  (void
> 
> >   (build-tree (take data2 274) 5 10)))
> 
> > (time
> 
> >  (for ([i (in-range 20)])
> 
> >    (build-tree (take data2 274) 5 10)))
> 
> > ;---
> 
> >
> 
> >
> 
> >
> 
> > * Second I modified the split function so it is usable in Windows and
> 
> > Linux. I'm using:
> 
> >     (string-split s #rx"\r?\n")
> 
> > I'm almost sure this is not the best method to write portable code.
> 
> > Any recommendation?
> 
> >
> 
> >
> 
> >
> 
> > * The most important change is to add `in-list` everywhere, i.e. replace
> 
> >     (for/sum ([split splits])  ...)
> 
> > with
> 
> >     (for/sum ([split (in-list splits)])  ...)
> 
> >
> 
> > I'm not sure that all of them are necessary. The `for` clauses without
> 
> > `in-list` are nice because they are very generic and can iterate over
> 
> > a lot of different data types. The problem is that they create a
> 
> > auxiliary sequence and use the methods of the sequence to iterate, so
> 
> > they are slower than expected. If you use `in-list`, the `for` is
> 
> > expanded to code that is specific to lists and is almost as efficient
> 
> > as the hand coded version like (let loop ([l my-list]) ...) and
> 
> > sometimes better.
> 
> >
> 
> > Most of the times you can use the generic version and use the code
> 
> > with any data type for free, but in the spots where you need fast code
> 
> > remember to use in-range, in-list, in-vector, ... (For the version
> 
> > that uses vectors for internal representation, you should probably use
> 
> > in-vector. I didn't look at the code.)
> 
> >
> 
> > Before this change a typical run time of the test is
> 
> >    cpu time: 157 real time: 165 gc time: 0
> 
> >    cpu time: 3390 real time: 3393 gc time: 63      (20 times)
> 
> >
> 
> > After this change a typical run time of the test is
> 
> >    cpu time: 62 real time: 61 gc time: 0
> 
> >    cpu time: 1266 real time: 1277 gc time: 62     (20 times)
> 
> >
> 
> > Gustavo
> 
> >
> 
> > PS: I made a PR with these changes in github.
> 
> 
> 
> Wow, are those timings for the "big" data set?!
> 
> 
> 
> This is about 20x faster than the vector solution. I wonder where I am 
> loosing so much time. Still not finished implementing everything, but I can 
> build a tree now with the vector solution. One guess I have is, that I am 
> currently (partly for debugging purposes) storing subsets of data in each 
> node of the tree, instead of only the split value. Another is, that I am 
> using structs (instead of minimalistic lists? tagged data?) to represent 
> nodes and they are constructed etc.
> 
> 
> 
> But other than that I have no clue what is using up so much time. Maybe I can 
> time every procedure in DrRacket somehow and see where the most time is spend.
> 
> 
> 
> I am also curious about the Typed Racket version and consider doing this once 
> I am done with the vector solution as well.
> 
> 
> 
> I don't really understand the (in-list ...) thing. This seems to be internal 
> magic to me. Maybe it informs about the type of what is iterated over and 
> that makes it faster?
> 
> 
> 
> When I tried I could not run @Daniel's solution. There was some kind of 
> error. I am running Racket 6.8. Should I update to 6.9, could that be the 
> issue?
> 
> 
> 
> -- 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Daniel Prager
> Agile/Lean Coaching, Innovation, and Leadership
> Profile: skillfire.co/dan
> Startups: youpatch.com, skillfire.co
> Twitter: @agilejitsu 
> Blog: agile-jitsu.blogspot.com
> Linkedin: au.linkedin.com/in/danielaprager

Yes that's it. Thanks! I merged the pull