Re: Question about PANDAS

2014-10-20 Thread Johann Hibschman
giacomo boffi pec...@pascolo.net writes:

  2. choose ONE flavour of python, either 2.7.x or 3.4.x
 - future is with 3.4,
 - most examples you'll find were written (are still written...)
   for 2.7.x

If you're interested in statistics (as comparisons to R suggest), I'd
recommend anaconda.  It comes with pandas built-in, for one.  I'd also
suggest the 3.4 version.  Finally, just over the past few months, we've
crossed over to where 3.4 is fully-functional in the anaconda
distribution.  (For a while statsmodels was the hold-out; matplotlib had
problems before that.  Now, though, all is good.)

Cheers,
Johann
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: I'm looking to start a team of developers, quants, and financial experts, to setup and manage an auto-trading-money-making-machine

2014-10-14 Thread Johann Hibschman
Marko Rauhamaa ma...@pacujo.net writes:

 ryguy7272 ryanshu...@gmail.com:

 I'm looking to start a team of developers, quants, and financial
 experts, to setup and manage an auto-trading-money-making-machine

 This has already been done: http://en.wikipedia.org/wiki/Sampo

And mocked by MST3K (sampo means flavor!):

  https://www.youtube.com/watch?v=cdfUkrbNvwA

-Johann (whose cousins are all Mattinens and Nikkanens)


Re: GCD in Fractions

2014-09-24 Thread Johann Hibschman
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info writes:

 blindanagram wrote:

 Secondly (as others here have pointed out), the mathematical properties
 of the greatest common divisor are well defined for both positive and
 negative integers.

 You keep saying that, but it simply is not true. Different people use
 different definitions. Some refuse to allow negative arguments at all. Some
 insist that the GCD must be positive. Others allow it to be negative.

I can't find a good source for allowing it to be negative, though.
Clearly, the primary use of the function is on the positive integers,
with the negatives being an extension.

 Mathworld does show one thing that suggests an interpretation for the GCD of
 negative values:

  The GCD is distributive
 GCD(ma,mb)=mGCD(a,b) 

 which tells us that:

 GCD(-x, -y) = -GCD(x, y)

 And yet, Mathematica has:

 GCD(-x, -y) = GCD(x, y)

 the very opposite of what Mathworld says, despite coming from the same
 people.

This is most likely simply them dropping the constraint that m must be
non-negative.  Wikipedia, for example, specifies it under Properties.

 The Collins Dictionary of Mathematics (second edition, 2002) says:

 highest common factor, greatest common factor, or greatest 
 common divisor (abbrev hcf, gcf, gcd)

 n, an integer d that exactly divides (sense 2) two given 
 integers a and b, and is such that if c divides a and b, 
 then c divides d; this definition extends to finite sets 
 of integers and to integral domains. For example, the 
 highest common factor of 12, 60 and 84 is 12.

 Yet again, we have no clear definition for negative values.

As pointed out, this definition always yields two values (positive and
negative), even for positive a and b, so there's nothing special for
negative a or b.  Typically, I've seen this augmented with "choose the
positive one" to get a single value.
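
As a rough illustration of the "two values" point, here is a brute-force
check of the dictionary definition quoted above. The helper name
`collins_gcds` is made up for this sketch, and it assumes a and b are not
both zero.

```python
def collins_gcds(a, b):
    """Return every d that divides a and b and is divisible by each
    common divisor c of a and b (the quoted definition, brute force)."""
    bound = max(abs(a), abs(b))
    candidates = [d for d in range(-bound, bound + 1) if d != 0]
    # All nonzero common divisors of a and b, of either sign.
    common = [c for c in candidates if a % c == 0 and b % c == 0]
    # Keep the common divisors that every other common divisor divides.
    return [d for d in common if all(d % c == 0 for c in common)]


print(collins_gcds(12, 18))    # [-6, 6]
print(collins_gcds(-12, -18))  # [-6, 6]
```

The definition picks out the same pair whatever the signs of the inputs;
"choose the positive one" is the extra convention that makes it a function.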

 Here's an example using Euclid's algorithm to calculate the GCD of negative
 numbers, and sure enough, you get a negative result:

The algorithm is pretty irrelevant here.  gcd's not defined by a
particular algorithm to calculate it.

From everything that I've seen, mathematicians consider the gcd to be
always positive.
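
A small sketch of why the sign coming out of a particular algorithm is an
implementation detail. A bare Euclid loop in Python inherits its sign from
the `%` operator, and taking the absolute value at the end recovers the
conventional non-negative gcd. The names `euclid_raw` and `gcd_nonneg` are
invented for this example; `math.gcd` is in the standard library from
Python 3.5 onward (newer than this thread) and also returns a non-negative
result.

```python
import math


def euclid_raw(a, b):
    """Bare Euclid loop; the sign of the result follows Python's % rule."""
    while b:
        a, b = b, a % b
    return a


def gcd_nonneg(a, b):
    """Same loop, normalised to the non-negative representative."""
    return abs(euclid_raw(a, b))


print(euclid_raw(-12, -18))  # -6
print(euclid_raw(-12, 18))   # 6
print(gcd_nonneg(-12, -18))  # 6
print(math.gcd(-12, -18))    # 6
```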

Now, that's not saying that fraction should implement the mathematical
gcd, if it doesn't need it.  That should be its own argument, though;
it doesn't help to add false doubt about what the gcd of negative
numbers should be.

Cheers,
Johann


Re: Why Python 4.0 won't be like Python 3.0

2014-08-19 Thread Johann Hibschman
Skip Montanaro s...@pobox.com writes:

 On Tue, Aug 19, 2014 at 9:27 AM, Grant Edwards invalid@invalid.invalid 
 wrote:
 I'm probably conflating the 1.5.2/2.0 and the 2.6 stuff.  I do
 remember delaying moving from 1.5.2 to 2.0 until I really had to, but
 I don't remember why.

 If you were a RedHat user during that timeframe, that might have
 contributed to your decision to delay. I no longer remember the
 details, but it was rather painful.

I vaguely remember holding off for a while until SWIG had 2.0 support,
or maybe Numeric lagged, or something, but that's getting pretty fuzzy.
There was definitely more there than, say, for 1.4 to 1.5.  It's hard to
believe that the Dubois/Hinsen/Hugunin article in Computers in Physics
(which was where I got my start with python) was a full 18 years ago.

Cheers,
Johann


Re: Python in financial services

2014-08-12 Thread Johann Hibschman
Rustom Mody rustompm...@gmail.com writes:

 I've been asked to formulate a Python course for financial services
 folk.

 If I actually knew about the subject, I'd have fatter pockets!
 Anyway, here are some thoughts. What am I missing?

Good luck!  It's a pretty broad field, so everyone probably has
different needs.

 - Libraries -- Decimal?

I've never seen decimal used, even though it makes sense for
accounting-style finance.  I've mostly been looking at forecasts,
trading, and risk, where floats are fine.  So maybe mention that it
exists, so people know where to look if they need it, but don't stress
it.
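
For anyone who does need it, a minimal sketch of the difference (the
amounts are made up for illustration):

```python
from decimal import Decimal

# Three payments of 0.10 each, once as binary floats and once as decimals.
float_total = sum([0.10, 0.10, 0.10])
decimal_total = sum([Decimal("0.10")] * 3, Decimal("0"))

print(float_total == 0.30)               # False: binary rounding error
print(decimal_total == Decimal("0.30"))  # True: exact decimal arithmetic
```

For forecasting or risk work that error is far below the noise; for
ledger-style totals that must reconcile to the cent, it is the reason
Decimal exists.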

 - scripts -- philosophy and infrastructure eg argparse, os.path

Basic argparse is very handy, but, again, I wouldn't spend too much time
on it.
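
Something on the scale of the sketch below is probably enough for a
course. The script purpose, the argument names, and the file name are all
hypothetical; the argument list is passed explicitly so the example runs
without a real command line.

```python
import argparse


def build_parser():
    """Build a small command-line parser for a hypothetical report script."""
    parser = argparse.ArgumentParser(description="Summarise a trade file.")
    parser.add_argument("path", help="input CSV file")
    parser.add_argument("--currency", default="USD", help="reporting currency")
    parser.add_argument("--verbose", action="store_true", help="print details")
    return parser


# In a real script this would be build_parser().parse_args() with no list.
args = build_parser().parse_args(["trades.csv", "--currency", "EUR"])
print(args.path, args.currency, args.verbose)
```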

 - Pandas
 - Numpy Scipy (which? how much?)

For me, pandas is huge, numpy is a nice fundamental substrate, while
only bits and pieces of scipy are used (mostly optimization).
statsmodels may also be worth a mention, as the answer to "how do I do a
regression?"

 - ipython + matplotlib + ??

IPython notebook + matplotlib is great.  At least show that it exists.
pandas plots may be enough, though.

 - Database interfacing

Definitely mention.

 - Excel interfacing (couple of libraries.. which?)

Meh, maybe.  At least give a strategy.  It always seems like a fool's
errand, though: I end up just dumping data to CSV and using that.
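
A minimal sketch of the dump-to-CSV route using only the standard
library. The column names and values are invented, and an in-memory
buffer stands in for a real file that Excel would open.

```python
import csv
import io

# Hypothetical rows; in practice these would come from a query or a model.
rows = [
    {"ticker": "AAA", "price": 10.5},
    {"ticker": "BBB", "price": 20.25},
]

# Write the rows out as CSV text.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["ticker", "price"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

# Read it back; note that every value comes back as a string.
read_back = list(csv.DictReader(io.StringIO(csv_text)))
print(read_back[0]["ticker"], float(read_back[1]["price"]))
```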

 - C(C++?) interfacing paradigms -- ranging from ctypes, cython to
   classic low-level

Probably not, but it depends on the audience.  An overview like
"ctypes will link to C-like libraries, cython lets you write python-like
code that runs fast, and there's SWIG and Boost.Python if you want to
write your own modules" is about all you need.

Hope that helps,
Johann


Re: NaN comparisons - Call For Anecdotes

2014-07-17 Thread Johann Hibschman
Anders J. Munch 2...@jmunch.dk writes:
 So far I received exactly the answer I was expecting.  0 examples of
 NaN!=NaN being beneficial.
 I wasn't asking for help, I was making a point.  Whether that will
 lead to improvement of Python, well, I'm not too optimistic, but I
 feel the point was worth making regardless.

Well, I just spotted this thread.  An easy example is, well, pretty much
any case where SQL NULL would be useful.  Say I have lists of borrowers,
the amount owed, and the amount they paid so far.

nan = float("nan")
borrowers = ["Alice", "Bob", "Clem", "Dan"]
amount_owed = [100.0, nan, 200.0, 300.0]
amount_paid = [100.0, nan, nan, 200.0]
who_paid_off = [b for (b, ao, ap) in
                zip(borrowers, amount_owed, amount_paid)
                if ao == ap]

I want to just get Alice from that list, not Bob.  I don't know how much
Bob owes or how much he's paid, so I certainly don't know that he's paid
off his loan.
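
To see what the NaN != NaN rule buys here, the sketch below runs the same
filter twice: once with NaN as the missing marker and once with None,
which compares equal to itself. The helper name `paid_off` is introduced
only for this illustration.

```python
nan = float("nan")
borrowers = ["Alice", "Bob", "Clem", "Dan"]


def paid_off(owed, paid):
    """Borrowers whose amount owed compares equal to the amount paid."""
    return [b for b, ao, ap in zip(borrowers, owed, paid) if ao == ap]


# Missing values marked with NaN: NaN == NaN is False, so Bob is excluded.
with_nan = paid_off([100.0, nan, 200.0, 300.0], [100.0, nan, nan, 200.0])

# Missing values marked with None: None == None is True, so Bob slips in.
with_none = paid_off([100.0, None, 200.0, 300.0], [100.0, None, None, 200.0])

print(with_nan)   # ['Alice']
print(with_none)  # ['Alice', 'Bob']
```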

Cheers,
Johann


Re: NaN comparisons - Call For Anecdotes

2014-07-17 Thread Johann Hibschman
Chris Angelico ros...@gmail.com writes:

 But you also don't know that he hasn't. NaN doesn't mean "unknown", it
 means "Not a Number". You need a more sophisticated system that allows
 for uncertainty in your data.

Regardless of whether this is the right design, it's still an example of
use.

As to the design, using NaN to implement NA is a hack with a long
history, see

  http://www.numpy.org/NA-overview.html

for some color.  Using NaN gets us a hardware-accelerated implementation
with just about the right semantics.  In a real example, these lists are
numpy arrays with tens of millions of elements, so this isn't a trivial
benefit.  (Technically, that's what's in the database; a given analysis
may look at a sample of 100k or so.)

 You have a special business case here (the need to
 record information with a "maybe" state), and you need to cope with
 it, which means dedicated logic and planning and design and code.

Yes, in principle.  In practice, everyone is used to the semantics of
R-style missing data, which are reasonably well-matched by nan.  In
principle, (NA == 1.0) should be an NA (missing) truth value, as should
(NA == NA), but in practice having it be False is more useful.  As an
example, indexing R vectors by a boolean vector containing NA yields NA
results, which is a feature that I never want.
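
A pure-Python sketch of the two behaviours, using plain lists instead of
numpy arrays or R vectors. Everything here is an assumption made for
illustration: None stands in for a missing truth value, and the function
names are invented; this only imitates the R semantics described above.

```python
import math

NA = None  # hypothetical stand-in for a missing (NA) truth value
nan = float("nan")


def eq_three_valued(x, y):
    """R-style comparison: if either side is missing, so is the answer."""
    if math.isnan(x) or math.isnan(y):
        return NA
    return x == y


def select_r_style(values, mask):
    """Imitate R indexing: an NA in the mask yields an NA element."""
    out = []
    for v, m in zip(values, mask):
        if m is NA:
            out.append(nan)
        elif m:
            out.append(v)
    return out


def select_nan_style(values, mask):
    """Plain boolean selection: a False comparison simply drops the row."""
    return [v for v, m in zip(values, mask) if m]


owed = [100.0, nan, 200.0]
paid = [100.0, nan, nan]

r_mask = [eq_three_valued(a, b) for a, b in zip(owed, paid)]
nan_mask = [a == b for a, b in zip(owed, paid)]

r_result = select_r_style(owed, r_mask)        # three entries, two of them NaN
nan_result = select_nan_style(owed, nan_mask)  # [100.0]
print(r_result, nan_result)
```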

Cheers,
Johann


Re: grimace: a fluent regular expression generator in Python

2013-07-17 Thread Johann Hibschman
Ben Last b...@benlast.com writes:

 Good points. I wanted to find a syntax that allows comments as well as
 being fluent:
 RE()
 .any_number_of.digits # Recall that any_number_of includes zero 
 .followed_by.an_optional.dot.then.at_least_one.digit # The dot is specifically optional
 # but we must have one digit as a minimum
 .as_string()

Speaking of syntax, have you looked at pyparsing?  I like their
pattern-matching syntax, and I can see it being applied to regexes.

They use an operator-heavy syntax, like:

'(' + digits * 3 + ')-' + digits * 3 + '-' + digits * 4

That seems easier for me to read than the foo.then.follow syntax.
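
A rough sketch of how that operator style could be applied to regexes.
This is not pyparsing's API and not grimace's; the `Pat` class and the
`_as_regex` helper are invented here, and only `+` (sequence) and `*`
(exact repetition) are supported.

```python
import re


def _as_regex(piece):
    """Turn a Pat or a literal string into regex source text."""
    return piece.regex if isinstance(piece, Pat) else re.escape(piece)


class Pat:
    """Tiny regex builder: + concatenates, * n repeats exactly n times."""

    def __init__(self, regex):
        self.regex = regex

    def __add__(self, other):
        return Pat(self.regex + _as_regex(other))

    def __radd__(self, other):
        # Called for 'literal' + Pat, since str does not know about Pat.
        return Pat(_as_regex(other) + self.regex)

    def __mul__(self, count):
        return Pat("(?:%s){%d}" % (self.regex, count))

    def compile(self):
        return re.compile(self.regex)


digits = Pat(r"\d")
phone = '(' + digits * 3 + ')-' + digits * 3 + '-' + digits * 4

print(bool(phone.compile().fullmatch("(555)-123-4567")))  # True
print(bool(phone.compile().fullmatch("(55)-123-4567")))   # False
```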

That then makes me think of ometa, which is a fun read, but probably not
completely relevant.

Regards,
Johann


Re: Best Scripting Language for Embedded Work?

2013-07-10 Thread Johann Hibschman
David T. Ashley dash...@gmail.com writes:

 We develop embedded software for 32-bit micros using Windows as the
 development platform.
...
 I know that Tcl/Tk would do all of the above, but what about Python?
 Any other alternatives?

Given that list, I'd say just use Tcl and be done.  You could force the
square peg of python into that round hole, but I doubt it'd be worth the
effort.

Cheers,
Johann


Re: Encoding NaN in JSON

2013-04-17 Thread Johann Hibschman
Miki Tebeka miki.teb...@gmail.com writes:

 I'm trying to find a way to have json emit float('NaN') as 'N/A'.
 No.  There is no way to represent NaN in JSON.  It's simply not part of the
 specification.
 I know that. I'm trying to emit the *string* 'N/A' for every NaN.

Easiest way is probably to transform your object before you try to write
it, e.g.

  def transform(x):
      # Recursively replace every float NaN with the string 'N/A'.
      if isinstance(x, dict):
          return dict((k, transform(v)) for k, v in x.items())
      elif isinstance(x, (list, tuple)):
          return [transform(v) for v in x]
      elif isinstance(x, float) and x != x:  # only NaN is unequal to itself
          return 'N/A'
      else:
          return x

Then just use

  json.dumps(transform(x))

rather than just

  json.dumps(x)
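
A self-contained run of the idea, restating the function so the block
executes on its own; the sample data is made up. Passing allow_nan=False
is an extra check added here: it makes json.dumps raise ValueError if any
NaN survives, instead of emitting the non-standard token NaN.

```python
import json


def transform(x):
    """Recursively replace every float NaN with the string 'N/A'."""
    if isinstance(x, dict):
        return dict((k, transform(v)) for k, v in x.items())
    elif isinstance(x, (list, tuple)):
        return [transform(v) for v in x]
    elif isinstance(x, float) and x != x:
        return 'N/A'
    else:
        return x


data = {"a": float("nan"), "b": [1.0, float("nan")]}

raw = json.dumps(float("nan"))  # 'NaN', which is not valid JSON
clean = json.dumps(transform(data), sort_keys=True, allow_nan=False)
print(raw)
print(clean)  # {"a": "N/A", "b": [1.0, "N/A"]}
```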


Re: allow line break at operators

2011-08-15 Thread Johann Hibschman
Chris Angelico ros...@gmail.com writes:

 Why is left-to-right inherently more logical than
 multiplication-before-addition? Why is it more logical than
 right-to-left? And why is changing people's expectations more logical
 than fulfilling them? Python uses the + and - symbols to mean addition
 and subtraction for good reason. Let's not alienate the mathematical
 mind by violating this rule. It would be far safer to go the other way
 and demand parentheses on everything.

I'm clearly a fool for allowing myself to be drawn into this thread,
but I've been playing a lot recently with the APL-derivative language J,
which uses a right-to-left operator precedence rule.

Pragmatically, this is because J defines roughly a bajillion operators,
and it would be impossible to remember the precedence of them all, but
it makes sense in its own way.

If you read 3 * 10 + 7, using right-to-left, you get "three times
something."  Then you read more and you get "three times (ten plus
something)."  And finally, you get 3*(10+7).  The prefix gives the
continuation for the rest of the calculation; no matter what you
substitute for X in 3*X, you will always just evaluate X, then multiply
it by 3.  Likewise, for 3*10+X, no matter what X is, you know you'll
add 10 and multiply by 3.

This took me a while to get used to, but it's definitely a nice
property.  Not much to do with python, but I do like the syntax enough
that I've implemented my own toy evaluator for J-like expressions in
python, to get around the verbosity of some bits of numpy.
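
A minimal sketch of the right-to-left rule, far smaller than a real J
evaluator and not the toy evaluator mentioned above. It handles only
non-negative integer literals and the infix operators +, - and *; the
function name `eval_rtl` is invented for this example.

```python
import operator
import re

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def eval_rtl(expr):
    """Evaluate 'n op n op n ...' with every operator taking everything
    to its right as its right-hand argument (no precedence table)."""
    tokens = re.findall(r"\d+|[-+*]", expr)

    def evaluate(toks):
        left = int(toks[0])
        if len(toks) == 1:
            return left
        # The operator applies to the value of the whole remaining tail.
        return OPS[toks[1]](left, evaluate(toks[2:]))

    return evaluate(tokens)


print(eval_rtl("3 * 10 + 7"))  # 51, i.e. 3 * (10 + 7)
print(3 * 10 + 7)              # 37 under Python's usual precedence
print(eval_rtl("10 - 3 - 2"))  # 9, i.e. 10 - (3 - 2)
```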

Regards,
Johann


Re: I am fed up with Python GUI toolkits...

2011-07-20 Thread Johann Hibschman
Thomas Jollans t...@jollybox.de writes:

 On 20/07/11 04:12, sturlamolden wrote:
 3. Unpythonic memory management: Python references to deleted C++
 objects (PyQt). Manual dialog destruction (wxPython). Parent-child
 ownership might be smart in C++, but in Python we have a garbage
 collector.

 I wonder - what do you think of GTK+?
 I've only used Qt with C++, and I've always been highly suspicious of wx
 (something about the API, or the documentation… I haven't had a look at
 it in a long time), but I always found PyGTK quite nice.

GTK+ doesn't work well at all on Mac, so if "cross-platform" includes
Macs, it's not a contender.

To quote the gtk-osx.sourceforge.net page:

   Developers considering GTK+ as a cross-platform environment for new
   work are advised to evaluate other toolkits carefully before
   committing to GTK if they consider OSX an important market.

From experience, GTK apps are pretty awful on OSX.

-Johann