[issue13670] Increase test coverage for pstats.py

2019-03-16 Thread andrea crotti


andrea crotti  added the comment:

It has been a long time, but if it's still useful, sure.

I can see some tests have been added in commit 
863b1e4d0e95036bca4e97c1b8b2ca72c19790fb
but if these are still relevant I'm happy to go ahead.

--

___
Python tracker 
<https://bugs.python.org/issue13670>
___



Re: [OT] Usage of U+00B6 PILCROW SIGN (was: generator slides review and Python doc (+/- text bug))

2014-02-04 Thread andrea crotti
2014-02-04  wxjmfa...@gmail.com:
 Le mardi 4 février 2014 15:39:54 UTC+1, Jerry Hill a écrit :


 Useless and really ugly.


I think this whole discussion is rather useless instead, why do you
care since you're not going to use this tool anyway?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: generator slides review and Python doc (+/- text bug)

2014-02-03 Thread andrea crotti
2014-02-03  wxjmfa...@gmail.com:
 generator slides review and Python doc


 I do not know what tool is used to produce such
 slides.

 When the mouse is over a text like a title (H* ... \H* ???)
 the text get transformed and a colored eol is appearing.

 Example with the slide #3:

 Even numbers
 becomes
 Even numbers§

 with a visible colored §, 'SECTION SIGN'


 I noticed the same effect with the Python doc
 since ? (long time).

 Eg.

 The Python Tutorial
 appears as
 The Python Tutorial¶

 with a visible colored ¶, 'PILCROW SIGN',
 blueish in Python 3, red in Python 2.7.6.


 And in plenty third party Python docs using
 probably the same tool as the official Python
 doc.
 The eol glyph may vary and may not be a § or a ¶.

 Windows, Firefox and others.

 The .chm files do not seem to be affected.

 jmf
 --
 https://mail.python.org/mailman/listinfo/python-list

I just saw this mail only now, you didn't reply to my email correctly..
Anyway I use this:
https://github.com/nyergler/hieroglyph
And I just use sphinx + RST to generate the slides, the raw source is here:
https://raw2.github.com/AndreaCrotti/generators/master/index.rst
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: generator slides review

2014-02-03 Thread andrea crotti
2014-02-03 Terry Reedy tjre...@udel.edu:
 On 2/2/2014 5:40 AM, andrea crotti wrote:

 In general, use assert (== AssertionError) to check program logic (should
 never raise). Remember that assert can be optimized away. Use other
 exceptions to check user behavior. So I believe that ValueError is
 appropriate here. I think I also questioned the particular check.


Yes that's right thanks fixed it


 'Generator functions', which you labeled 'generators', are functions, not
 iterators. The generators they return (and the generators that generator
 expressions evaluate to) are iterators, and more.

 >>> type(a for a in 'abc')
 <class 'generator'>

 I am not sure whether 'specialized' or 'generalized' is the better term.

Well it's true that they are functions, but they also behaved
differently than functions.
I've read there has been some debate at that time whether to create a
different keyword to define generators or not, in the end they didn't
but it wouldn't be wrong imho..



 This was mainly to explain how something like
 for el in [1, 2, 3]:
  print(el)

 can work,


 But that is no longer how it *does* work. All the builtin xyz collection
 classes have a corresponding xyz_iterator class with a __next__ method that
 knows how to sequentially access collection items. We do not normally see or
 think about them, but they are there working for us every time we do 'for
 item in xyz_instance:'

 >>> [].__iter__()
 <list_iterator object at 0x035096A0>

 In Python one could write the following:

 class list_iterator:
     def __init__(self, baselist):
         self.baselist = baselist
         self.index = -1  # see __next__ for why

     def __iter__(self):
         return self

     def __next__(self):
         self.index += 1
         return self.baselist[self.index]

Yes maybe that's a much better example to show thank you.



 Yes this is intentionally buggy. The thing is that I wanted to show
 that sometimes generating things makes it harder to debug, and delays
 some errors, which are anyway there but would come up immediately in
 case of a list creation.
 I could not find a better non artificial example for this, any
 suggestion is welcome..


 slide 1
 -
 def recip_list(start, stop):
     lis = []
     for i in range(start, stop):
         lis.append(1/i)
     return lis

 for x in recip_list(-100, 3):  # fail here
     print x

 immediate traceback that include the for line

 slide 2
 ---
 def recip_gen(start, stop):
     for i in range(start, stop):
         yield 1/i


 for x in recip_gen(-100, 3):
     print x  # fail here after printing 100 lines
 ...
 delayed traceback that omits for line with args that caused problem


That's already better, another thing which I just thought about could
be this (which actually happened a few times):

def original_gen():
    count = 0
    while count < 10:
        yield count
        count += 1


def consumer():
    gen = original_gen()
    # lis = list(gen)
    for n in gen:
        print(n * 2)

if I uncomment the line with lis = list(gen)
it won't print anything anymore, because a generator can only be
looped over ONCE.
That maybe is a better example of possible drawback? (well maybe not a
drawback but a potential common mistake)
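
To make the single-pass behaviour concrete, a quick sketch using
original_gen from above:

gen = original_gen()
print(list(gen))   # [0, 1, 2, ..., 9] -- this walk consumes the generator
print(list(gen))   # [] -- a second pass finds nothing left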
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: generator slides review

2014-02-02 Thread andrea crotti
2014-02-02 Terry Reedy tjre...@udel.edu:
 On 2/1/2014 9:12 AM, andrea crotti wrote:

 Comments:

 The use is assert in the first slide seem bad in a couple of different
 respects.


Why is it bad? It's probably not necessary but since we ask for a
range it might be good to check if the range is valid.
Maybe I should raise ValueError instead for a better exception?

 The use of 'gen_even' before it is defined.


Well this is because I'm saying that I wish I had something like this,
which I define just after. It might be confusing if it's not defined
but I thought it's nice to say what I would like to do and then
actually define it, what do you think?


 A generator expression evaluates (better than 'yields') to a generator, not
 just an iterator.


Ok thanks fixed

 The definition of 'generator' copies the wrong and confused glossary entry.
 Generator functions return generators, which are iterators with extra
 behavior.


I understood instead that it was the opposite, a generator is a
specialized iterator, so it would be still correct that it returns an
iterator, is that wrong?

 I would leave out For loop(2). The old pseudo-getitem iterator protocol is
 seldom explicitly used any more, in the way you showed.


This was mainly to explain how something like
for el in [1, 2, 3]:
print(el)

can work, assuming defining an object list-like that just implements
__getitem__.
It's not probably how it's implemented for lists but I thought it
could clarify things..
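
For reference, a minimal sketch of that old protocol (any object with
just __getitem__ can be looped over):

class Squares:
    # no __iter__ here: the for loop falls back to indexing from 0 upwards
    def __getitem__(self, index):
        if index >= 5:
            raise IndexError  # an IndexError is what ends the loop
        return index * index

for el in Squares():
    print(el)   # 0 1 4 9 16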

 In 'Even numbers', I have no idea what the complication of next_even() is
 about.

Just because I wanted to find a simple way to get the next even (in
case I pass an odd start number).
I could also do inline but I thought it was more clear..
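
Presumably it boils down to something like this (just a guess at what
the slide computes):

def next_even(n):
    # smallest even number >= n
    return n if n % 2 == 0 else n + 1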

 'Lazyness drawbacks' overflow_list is bizarre and useless.  overflow_gen is
 bizarre and buggy. If you are intentionally writing buggy code to make a
 point, label it as such on the slide.


Yes this is intentionally buggy. The thing is that I wanted to show
that sometimes generating things makes it harder to debug, and delays
some errors, which are anyway there but would come up immediately in
case of a list creation.
I could not find a better non artificial example for this, any
suggestion is welcome..


 Iterators just produce values. Generators can consume as well as produce
 values, which is why they can act as both iterators and coroutines.


Well, isn't it clearer to call them by different names, since they do
quite a different job as coroutines than as generators? (I see this done
quite often)

Thanks a lot for the great feedback
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: generator slides review

2014-02-02 Thread andrea crotti
2014-02-01 Miki Tebeka miki.teb...@gmail.com:

 My 2 cents:

 slide 4:
 [i*2 for i in range(10)]


Well this is not correct in theory because the end should be the max
number, not the number of elements.
So it should be
[i*2 for i in range(10/2)] which might be fine but it's not really
more clear imho..

 slide 9:
 while True:
     try:
         it = next(g)
         body(it)
     except StopIteration:
         break


Changed it thanks

 slide 21:
 from itertools import count, ifilterfalse

 def divided_by(p):
     return lambda n: n % p == 0

 def primes():
     nums = count(2)
     while True:
         p = next(nums)
         yield p
         nums = ifilterfalse(divided_by(p), nums)


Thank you that's nicer, but ifilterfalse is not in Python 3 (it was
renamed to itertools.filterfalse).
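
For reference, the Python 3 spelling of that slide would be roughly
(only the import changes):

from itertools import count, filterfalse, islice

def divided_by(p):
    return lambda n: n % p == 0

def primes():
    nums = count(2)
    while True:
        p = next(nums)
        yield p
        # wrap the stream so multiples of p are skipped from now on
        nums = filterfalse(divided_by(p), nums)

print(list(islice(primes(), 10)))   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]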
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: generator slides review

2014-02-02 Thread andrea crotti
The slides are updated now

2014-02-02 andrea crotti andrea.crott...@gmail.com:
 2014-02-01 Miki Tebeka miki.teb...@gmail.com:

 My 2 cents:

 slide 4:
 [i*2 for i in range(10)]


 Well this is not correct in theory because the end should be the max
 number, not the number of elements.
 So it should be
 [i*2 for i in range(10/2)] which might be fine but it's not really
 more clear imho..

 slide 9:
  while True:
      try:
          it = next(g)
          body(it)
      except StopIteration:
          break


 Changed it thanks

 slide 21:
  from itertools import count, ifilterfalse

  def divided_by(p):
      return lambda n: n % p == 0

  def primes():
      nums = count(2)
      while True:
          p = next(nums)
          yield p
          nums = ifilterfalse(divided_by(p), nums)


  Thank you that's nicer, but ifilterfalse is not in Python 3 (it was
  renamed to itertools.filterfalse).
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: generator slides review

2014-02-02 Thread andrea crotti
Sorry left too early, the slides are updated with the fixes suggested,
thanks everyone.
https://dl.dropboxusercontent.com/u/3183120/talks/generators/index.html#1

For me the biggest problems are still:
- to find some more interesting example that is easy enough to explain
- to find a better order in which to explain things, to tell a clear story

2014-02-02 andrea crotti andrea.crott...@gmail.com:
 The slides are updated now

 2014-02-02 andrea crotti andrea.crott...@gmail.com:
 2014-02-01 Miki Tebeka miki.teb...@gmail.com:

 My 2 cents:

 slide 4:
 [i*2 for i in range(10)]


 Well this is not correct in theory because the end should be the max
 number, not the number of elements.
 So it should be
 [i*2 for i in range(10/2)] which might be fine but it's not really
 more clear imho..

 slide 9:
  while True:
      try:
          it = next(g)
          body(it)
      except StopIteration:
          break


 Changed it thanks

 slide 21:
  from itertools import count, ifilterfalse

  def divided_by(p):
      return lambda n: n % p == 0

  def primes():
      nums = count(2)
      while True:
          p = next(nums)
          yield p
          nums = ifilterfalse(divided_by(p), nums)


  Thank you that's nicer, but ifilterfalse is not in Python 3 (it was
  renamed to itertools.filterfalse).
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: generator slides review

2014-02-02 Thread andrea crotti
Thanks everyone for your feedback.
The talk I think went well, maybe I was too fast because I only used 21 minutes.

From the audience feedback, there were some questions about my Buggy
code example, so yes probably it's not a good example since it's too
artificial.

I'll have to find something more useful about that or just skip this maybe.
For possible generator drawbacks though I could add maintainability:
if you start passing generators around through 3-4 nested levels, finding
out what the original source of the data is can be difficult.

I'm also still not convinced by the definitions, which I tried now to
make clear, and say something like:
- an iterator defines *how you iterate* over an object (with the
__next__ method)
- an iterable defines *if you can iterate* over an object (with the
__iter__ method)

And when I do something like this:

class GenIterable:
    def __init__(self, start=0):
        self.even = start if is_even(start) else start + 1

    def __iter__(self):
        return self

    def __next__(self):
        tmp = self.even
        self.even += 2
        return tmp


it basically means that a GenIterable object is iterable (because
of __iter__) and that the way you iterate over it is to call the __next__
method on the object itself (since we return self and we define
__next__).
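
A quick hedged usage example (is_even is assumed from the talk's code;
itertools.islice takes a finite slice of the infinite iterator):

from itertools import islice

evens = GenIterable(start=3)      # first even number >= 3 is 4
print(list(islice(evens, 5)))     # [4, 6, 8, 10, 12]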

That seems clear enough, what do you think?
I might give this talk again so feedback is still appreciated!
-- 
https://mail.python.org/mailman/listinfo/python-list


generator slides review

2014-02-01 Thread andrea crotti
I'm giving a talk tomorrow @Fosdem about generators/iterators/iterables..

The slides are here (forgive the strange Chinese characters):
https://dl.dropboxusercontent.com/u/3183120/talks/generators/index.html#3

and the code I'm using is:
https://github.com/AndreaCrotti/generators/blob/master/code/generators.py
and the tests:
https://github.com/AndreaCrotti/generators/blob/master/code/test_generators.py

If anyone has any feedback or wants to point out I'm saying something
stupid I'd love to hear it before tomorrow (or also later, I might give
this talk again).
Thanks
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: how to avoid spaghetti in Python?

2014-01-21 Thread andrea crotti
2014/1/21 CM cmpyt...@gmail.com:
 I've been learning and using Python for a number of years now but never 
 really got particularly disciplined about all good coding practices.  I've 
 definitely learned *some*, but I'm hoping this year to take a good step up in 
 terms of refactoring, maintainability, and mostly just de-spaghettizing my 
 approach to writing Python programs.


It's not really a problem of Python, you just want to learn more about
OO principles and good design practices, and about that there are
hundreds of good books to read!


 A few specific questions in this area...

 1) One of my main spaghetti problems is something I don't know what to ever 
 call.  Basically it is that I sometimes have a chain of functions or 
 objects that get called in some order and these functions may live in 
 different modules and the flow of information may jump around quite a bit, 
 sometimes through 4-5 different parts of the code, and along the way 
 variables get modified (and those variables might be child objects of the 
 whole class, or they may just be objects that exist only within functions' 
 namespaces, or both).  This is hard to debug and maintain.  What would people 
 recommend to manage this?  A system of notes?  A way to simplify the flow?  
 And what is this problem called (if something other than just spaghetti code) 
 so I can Google more about it?


Just define clearly objects and methods and how they interact with
each other in a logic way, and you won't have this problem anymore.

 2) A related question:  Often I find there are two ways to update the value 
 of an object, and I don't know what's best in which circumstances... To begin 
 to explain, let's say the code is within a class that represents a Frame 
 object in a GUI, like wxPython.  Now let's say ALL the code is within this 
 wxFrame class object.  So, any object that is named with self. prepended, 
 like self.panel, or self.current_customer, or self.current_date, will be a 
 child object of that frame class, and therefore is sort of global to the 
 whole frame's namespace and can therefore be accessed from within any 
 function in the class. So let's say I have a function called 
 self.GetCurrentCustomer().  To actually get the name of the current customer 
 into RAM, it goes into the database and uses some rule to get the current 
 customer.  NOW, the question is, which of these should I do?  This:

   def GetCurrentCustomer(self):
   self.current_customer = #do some database stuff here

 Or this:

   def GetCurrentCustomer(self):
   current_customer = #do some database stuff here
   return current_customer

 And what difference does it make?  In the first case, I am just updating the 
 global object of the current_customer, so that any function can then use 
 it.  In the second case, I am only returning the current_customer to whatever 
 function(s) call this GetCurrentCustomer() function.


GetCurrentCustomer should really be get_current_customer if you don't
want people screaming at you.
And about the question, it depends: is the database stuff going to be expensive?
Do you always need a fresh value?

And by the way, if you're never actually using self in a method
maybe it should be a function, or at least a classmethod instead.
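
If the database call is expensive and the value rarely changes, one option
(a sketch, not the original code; query_current_customer is a hypothetical
db call) is to cache it but keep an explicit refresh:

class CustomerFrame(object):
    def __init__(self, db):
        self.db = db
        self._current_customer = None

    def get_current_customer(self, refresh=False):
        # hit the database only on first access or when a fresh value is forced
        if refresh or self._current_customer is None:
            self._current_customer = self.db.query_current_customer()
        return self._current_customer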

 My hunch is the first way leads to spaghetti problems.  But I want to 
 understand what the best practices are in this regard. I have found in some 
 cases the first method seemed handy, but I'm not sure what the best way of 
 thinking about this is.

 3) Generally, what are other tools or approaches you would use to organize 
 well a good-sized project so to avoid fighting yourself later when you don't 
 understand your own code and the flow of information through it?  By good 
 sized, say about 20,000 lines of Python code, or something like that.


Good architecture and some meaningful directory structure are good
enough; to navigate I use Emacs + ack and I'm already very productive even
with bigger projects than that.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: lxml tostring quoting too much

2013-08-07 Thread andrea crotti
2013/8/6 Chris Down ch...@chrisdown.name:
 On 2013-08-06 18:38, andrea crotti wrote:
 I would really like to do the following:

 from lxml import etree as ET
 from lxml.builder import E

 url = "http://something?x=10&y=20"
 l = E.link(url)
 ET.tostring(l) -> <link>http://something?x=10&amp;y=20</link>

 However the lxml tostring always quotes the &, I can't find a way to
 tell it to avoid quoting it.

 You're probably aware, but without the escaping, it is no longer well formed
 XML. Why do you want to do that? Is there a larger underlying problem that
 should be solved instead?

 Either way, you can use unescape from the xml.sax.saxutils module[0].

 Chris

 0: http://docs.python.org/2/library/xml.sax.utils.html

Yes I know it's not correct, I thought that I still had to send that
anyway but luckily the problem was somewhere else, so encoding was
actually necessary and I don't need to do something strange..
-- 
http://mail.python.org/mailman/listinfo/python-list


lxml tostring quoting too much

2013-08-06 Thread andrea crotti
I would really like to do the following:

from lxml import etree as ET
from lxml.builder import E

url = "http://something?x=10&y=20"
l = E.link(url)
ET.tostring(l) -> <link>http://something?x=10&amp;y=20</link>

However the lxml tostring always quotes the &, I can't find a way to
tell it to avoid quoting it.
Is it possible?
Thanks
-- 
http://mail.python.org/mailman/listinfo/python-list


decorator to fetch arguments from global objects

2013-06-18 Thread andrea crotti
Using a CouchDB server we have a different database object potentially for
every request.

We already set that db in the request object to make it easy to pass it
around from our django app, however it would be nice if I could set it once
in the API and automatically fetch it from there.

Basically I have something like

class Entity:
 def save_doc(db)
...

I would like basically to decorate this function in such a way that:
- if I pass a db object use it
- if I don't pass it in try to fetch it from a global object
- if both don't exist raise an exception

Now it kind of works already with the decorator below.
The problem is that the argument is positional, so I might end up passing it
twice.
So I have to enforce that 'db', if present, is passed as the first argument..

It would be a lot easier removing the db from the arguments but then it
would look too magic and I didn't want to change the signature.. any other
advice?

from functools import wraps

def with_optional_db(func):
    """Decorator that sets the database to the global current one if
    not passed in or if passed in and None
    """
    @wraps(func)
    def _with_optional_db(*args, **kwargs):
        func_args = func.func_code.co_varnames
        db = None
        # if it's defined in the first elements it needs to be
        # assigned to *args, otherwise to kwargs
        if 'db' in func_args:
            assert 'db' == func_args[0], "Needs to be the first defined"
        else:
            db = kwargs.get('db', None)

        if db is None:
            kwargs['db'] = get_current_db()

        assert kwargs['db'] is not None, "Can't have a not defined database"
        ret = func(*args, **kwargs)
        return ret

    return _with_optional_db
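
A hedged usage sketch (save_doc and get_current_db are only illustrative
names, not the real API):

@with_optional_db
def save_doc(db=None, doc=None):
    db.save(doc)              # hypothetical save call on the db object

save_doc(doc={'x': 1})        # 'db' gets filled in from get_current_db()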
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python-django for dynamic web survey?

2013-06-18 Thread andrea crotti
Django makes your life a lot easier in many ways, but you still need some
time to learn it.
The task you're attempting is not trivial though; depending on your
experience it might take a while with any library/framework..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: decorator to fetch arguments from global objects

2013-06-18 Thread andrea crotti
2013/6/18 Wolfgang Maier wolfgang.ma...@biologie.uni-freiburg.de

 andrea crotti andrea.crotti.0 at gmail.com writes:

 
 
  Using a CouchDB server we have a different database object potentially
 for
 every request.
 
  We already set that db in the request object to make it easy to pass it
 around form our django app, however it would be nice if I could set it once
 in the API and automatically fetch it from there.
 
  Basically I have something like
 
  class Entity:
   def save_doc(db)
  ...
 
  I would like basically to decorate this function in such a way that:
  - if I pass a db object use it
  - if I don't pass it in try to fetch it from a global object
  - if both don't exist raise an exception
 
  Now it kinda of works already with the decorator below.
  The problem is that the argument is positional so I end up maybe passing
 it twice.
  So I have to enforce that 'db' if there is passed as first argument..
 
  It would be a lot easier removing the db from the arguments but then it
 would look too magic and I didn't want to change the signature.. any other
 advice?
 
   def with_optional_db(func):
       """Decorator that sets the database to the global current one if
       not passed in or if passed in and None
       """
       @wraps(func)
       def _with_optional_db(*args, **kwargs):
           func_args = func.func_code.co_varnames
           db = None
           # if it's defined in the first elements it needs to be
           # assigned to *args, otherwise to kwargs
           if 'db' in func_args:
               assert 'db' == func_args[0], "Needs to be the first defined"
           else:
               db = kwargs.get('db', None)
 
           if db is None:
               kwargs['db'] = get_current_db()
 
           assert kwargs['db'] is not None, "Can't have a not defined database"
           ret = func(*args, **kwargs)
           return ret
 
       return _with_optional_db
 
 

 I'm not sure, whether your code would work. I get the logic for the db in
 kwargs case, but why are you checking whether db is in func_args? Isn't the
 real question whether it's in args ?? In general, I don't understand why
 you
 want to use .func_code.co_varnames here. You know how you defined your
 function (or rather method):
 class Entity:
 def save_doc(db):
 ...
 Maybe I misunderstood the problem?
 Wolfgang




 --
 http://mail.python.org/mailman/listinfo/python-list



Well the point is that I could allow someone to not use db as argument of
the function if he only wants to use the global db object..

Or at least I want to check that it's the first argument and not in another
position, just as a sanity check.

I might drop some magic and make it a bit simpler though, even the default
argument DEFAULT_DB could be actually good, and I would not even need the
decorator at that point..
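
The simpler decorator-free variant could be as little as this (a sketch;
get_current_db as before, db.save is a hypothetical call):

def save_doc(db=None, doc=None):
    if db is None:
        db = get_current_db()
    if db is None:
        raise ValueError("no db passed in and no global database set")
    db.save(doc)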
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: decorator to fetch arguments from global objects

2013-06-18 Thread andrea crotti
2013/6/18 Terry Reedy tjre...@udel.edu

 On 6/18/2013 5:47 AM, andrea crotti wrote:

 Using a CouchDB server we have a different database object potentially
 for every request.

 We already set that db in the request object to make it easy to pass it
 around form our django app, however it would be nice if I could set it
 once in the API and automatically fetch it from there.

 Basically I have something like

 class Entity:
   def save_doc(db)


 If save_doc does not use an instance of Entity (self) or Entity itself
 (cls), it need not be put in the class.


I missed a self, it's a method actually..




   ...

 I would like basically to decorate this function in such a way that:
 - if I pass a db object use it
 - if I don't pass it in try to fetch it from a global object
 - if both don't exist raise an exception


 Decorators are only worthwhile if used repeatedly. What you specified can
 easily be written, for instance, as

  def save_doc(db=None):
      if db is None:
          db = fetch_from_global()
      if isinstance(db, dbclass):
          save_it()
      else:
          raise ValueError('need dbobject')



Yes that's exactly why I want a decorator, to avoid all this boilerplate
for every function or method that uses a db object..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Getting a callable for any value?

2013-05-29 Thread andrea crotti

On 05/29/2013 06:46 PM, Croepha wrote:

Is there anything like this in the standard library?

class AnyFactory(object):
    def __init__(self, anything):
        self.product = anything
    def __call__(self):
        return self.product
    def __repr__(self):
        return "%s.%s(%r)" % (self.__class__.__module__,
                              self.__class__.__name__, self.product)


my use case is:
collections.defaultdict(AnyFactory(collections.defaultdict(AnyFactory(None))))




I think I would scratch my head for a good half an hour if I see a 
string like this, so I hope there isn't anything in the standard library 
to do that :D
-- 
http://mail.python.org/mailman/listinfo/python-list


pip and different branches?

2013-05-20 Thread andrea crotti
We use github and we work on many different branches at the same time.

The problem is that we have 5 repos now, and for each repo we might
have the same branches on all of them.

Now we use pip and install requirements such as:
git+ssh://g...@github.com/repo.git@dev

Now the problem is that the requirements file are also under revision
control, and constantly we end up in the situation that when we merge
branches the branch settings get messed up, because we forget to change
them.

I was looking for a solution for this that would allow me to:
- use the branch of the main repo for all the dependencies
- fallback on master if that branch doesn't exist

I thought about a few options:
1. create a wrapper for PIP that manipulates the requirement files, which
   would then become templates.
   In this way I would however have to know if a branch exists or not,
   and I didn't find a way to do that without cloning the repo (see the
   sketch after this list).

2. modify PIP to not fail when checking out a non existing branch, so
   that if it's not found it falls back on master automatically.

3. use some git magic hooks but I'm not sure what exactly

4. stop using virtualenv + pip and use something smarter that handles
   this.
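
For option 1, a minimal sketch of checking whether a branch exists without
cloning (git ls-remote only talks to the remote; names are illustrative):

import subprocess

def branch_exists(repo, branch):
    # "git ls-remote --heads <repo> <branch>" prints a line only if the branch exists
    out = subprocess.check_output(['git', 'ls-remote', '--heads', repo, branch])
    return bool(out.strip())

def ref_for(repo, branch):
    # fall back on master when the feature branch is missing
    return branch if branch_exists(repo, branch) else 'master'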

Any suggestions?
Thanks
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: dynamic forms generation

2013-04-19 Thread andrea crotti
Well I think since we are using django anyway (and bottle on the API side)
I'm not sure why we would use flask forms for this..

Anyway the main question is probably: is it worth trying to define a DSL or
not?
The problem I see is that we have a lot of very complex requirements;
trying to define a DSL that is able to represent everything might be a
massive pain.

I prefer to do something smart with metaclasses and just use Python as the
configuration language itself, using metaclasses and similar nice things to
define the language used..
The only annoying part is the persistence, I don't like too much the idea
of storing Python code in the db for example, but maybe it's fine, many
projects configure things with Python anyway..
http://mail.python.org/mailman/listinfo/python-list


dynamic forms generation

2013-04-16 Thread andrea crotti
We are re-designing a part of our codebase, which should in short be
able to generate forms with custom fields.

We use django for the frontend and bottle for the backend (using CouchDB
as database), and at the moment we simply plug extra fields on normal
django forms.

This is not really scalable, and we want to make the whole thing more
generic.

So ideally there could be a DSL (YAML or something else) that we could
define to then generate the forms, but the problem is that I'm quite
sure that this DSL would soon become too complex and inadequate, so I'm
not sure if it's worth it since no one should write forms by hand anyway.

Between the things that we should be able to do there are:
- dependent fields
- validation (both server and client side, better if client-side
  auto-generated)
- following DRY as much as possible

Any suggestions of possible designs or things I can look at?
-- 
http://mail.python.org/mailman/listinfo/python-list


Mixin way?

2013-04-03 Thread andrea crotti
I have some classes that have shared behaviours, for example in our
scenario an object can be visited, where something that is visitable
would have some behaviour like

--8<---------------cut here---------------start------------->8---
class Visitable(Mixin):
    FIELDS = {
        'visits': [],
        'unique_visits': 0,
    }

    def record_view(self, who, when):
        self.visits.append({'who': who, 'when': when})
        self.unique_visits += 1
--8<---------------cut here---------------end--------------->8---

Where the Mixin class simply initialises the attributes:

--8<---------------cut here---------------start------------->8---
class Mixin(object):
    def __init__(self, **kwargs):
        for key, val in self.FIELDS.items():
            setattr(self, key, val)

        for key, val in kwargs.items():
            if key in self.FIELDS:
                setattr(self, key, val)
--8<---------------cut here---------------end--------------->8---


So now I'm not sure how to use it though.
One way would be multiple subclasses

class MyObjectBase(object):
   pass

class MyObj(MyObjectBase, Visitable):
   pass

for example.
This solution is probably easy, but at the same time disturbing because
MyObjectBase is semantically quite different from Visitable, so
subclassing from both seems wrong..

The other solution (which is what is partially done now) is to use
another class attribute:

class ObjectWithMixin(CouchObject):
MIXINS = [Visitable]

and then do all the smart things needed:
- at object construction time
- when setting attributes and so on..

This solution is more complicated to implement but maybe is more
flexible and more correct, what do you think?
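
A rough sketch of what the construction-time part could look like (just one
possible implementation, not the existing code):

class ObjectWithMixins(object):
    MIXINS = []

    def __init__(self, **kwargs):
        # each declared mixin initialises its own fields on the instance
        for mixin in self.MIXINS:
            for key, default in mixin.FIELDS.items():
                # note: mutable defaults like [] should really be copied per instance
                setattr(self, key, kwargs.pop(key, default))

class Document(ObjectWithMixins):
    MIXINS = [Visitable]

doc = Document(unique_visits=3)
print(doc.visits, doc.unique_visits)   # [] 3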
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Mixin way?

2013-04-03 Thread andrea crotti
2013/4/3 Steven D'Aprano steve+comp.lang.pyt...@pearwood.info

 [snip]

 So, if you think of Visitable as a gadget that can be strapped onto
 your MyObj as a component, then composition is probably a better design.
 But if you think of Visitable as a mere collection of behaviour and
 state, then a mixin is probably a better design.


Well I can explain better the situation to make it more clear.

We are using CouchDb and so far it has been (sigh) a brutal manipulation of
dictionaries everywhere, with code duplication and so on.

Now I wanted to encapsulate all the entities in the DB in proper objects,
so I have a CouchObject:


class CouchObject(object):
    """
    Encapsulate an object which has the ability to be saved to a couch
    database.
    """
    #: list of fields that get filled in automatically if not passed in
    AUTO = ['created_datetime', 'doc_type', '_id', '_rev']
    #: dictionary with some extra fields with default values; if not
    #: passed in the constructor the default value gets set to the attribute
    DEFAULTS = {}
    REQUIRED = []
    OPTIONAL = []
    TO_RESOLVE = []
    MIXINS = []

Where every subclass can redefine these attributes to get something
done automatically by the constructor for convenience.
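
The constructor implied by those class attributes might look roughly like
this (a sketch, not the actual code):

import uuid
from datetime import datetime

class CouchObjectSketch(object):
    AUTO = ['created_datetime', 'doc_type', '_id', '_rev']
    DEFAULTS = {}
    REQUIRED = []

    def __init__(self, **kwargs):
        for field in self.REQUIRED:
            if field not in kwargs:
                raise ValueError("missing required field %r" % field)
        # defaults fill in whatever was not passed explicitly
        for field, default in self.DEFAULTS.items():
            kwargs.setdefault(field, default)
        # AUTO fields are generated when absent
        kwargs.setdefault('doc_type', self.__class__.__name__)
        kwargs.setdefault('_id', uuid.uuid4().hex)
        kwargs.setdefault('_rev', None)
        kwargs.setdefault('created_datetime', datetime.utcnow().isoformat())
        for field, value in kwargs.items():
            setattr(self, field, value)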

Now however there is a lot of behaviour shared between them, so I want
to encapsulate it out in different places.

I think the MIXINS as I would use it is the normal composition
pattern, the only difference is that the composition is done per class
and not per object (again for laziness reasons), right?

Probably subclassing might be fine as well, and makes it simpler, but
I don't like subclassing from multiple classes too much...
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: groupby behaviour

2013-02-26 Thread andrea crotti
2013/2/26 Ian Kelly ian.g.ke...@gmail.com:
 On Tue, Feb 26, 2013 at 9:27 AM, andrea crotti
 andrea.crott...@gmail.com wrote:
 So I was trying to use groupby (which I used in the past), but I
 noticed a very strange thing if using list on
 the result:

 As stated in the docs:

 
 The returned group is itself an iterator that shares the underlying
 iterable with groupby(). Because the source is shared, when the
 groupby() object is advanced, the previous group is no longer visible.
 So, if that data is needed later, it should be stored as a list:
 
 --
 http://mail.python.org/mailman/listinfo/python-list


I should have read more carefully sorry, I was in the funny situation
where it would have actually worked in the production code but it was
failing in the unit tests (because I was using list only there).

This sharing is very weird though and still doesn't really look
right; is it done just for performance reasons?
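
A minimal illustration of the pitfall and of the fix from the docs:

from itertools import groupby

data = ['a1', 'a2', 'b1']

# correct: copy each group to a list while the groupby object is positioned on it
groups = [(key, list(grp)) for key, grp in groupby(data, key=lambda s: s[0])]
print(groups)                    # [('a', ['a1', 'a2']), ('b', ['b1'])]

# pitfall: listing the groupby object first advances it past every group,
# so the stored group iterators no longer see their items
pairs = list(groupby(data, key=lambda s: s[0]))
print([list(grp) for _key, grp in pairs])   # the 'a' items are gone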
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: running multiple django/bottle instances

2013-01-07 Thread andrea crotti
Not really, on the staging server we are using the django/bottle webserver..

Anyway I was thinking that a great possible solution might be to set
up something like buildbot to:
- checkout all the needed branches

- run the various servers for all of them on different ports, where maybe the
  mapping  port-branch is set from somewhere

- run all the unit tests for them and make the results available

With a similar configuration we would probably be very happy already,
anyone doing something similar?

2013/1/7 Michel Kunkler michel.kunk...@gmail.com:
 As you are certainly running a production server like Apache, your problem
 is actually not Python related.
 If you want to run your applications on different ports, take a look at e.g.
 Apache's virtual host configurations.
 http://httpd.apache.org/docs/2.2/vhosts/examples.html

 Am 03.01.2013 17:35, schrieb Andrea Crotti:

 I'm working on a quite complex web app that uses django and bottle
 (bottle for the API which is also restful).

 Before I came they started to use a staging server to be able to try out
 things properly before they get published, but now we would like to have
 the possibility to see multiple branches at a time.

 First we thought about multiple servers, but actually since bottle and
 django can be made to run on different ports, I thought why not running
 everything on one server on different ports?

 We also use elasticsearch and couchdb for the data, but these two
 don't change that much and can just be a single instance.

 So what would be really great could be

 staging_server/branch_x
 staging_server/branch_y

 and something keeps track of all the various branches tracked, and run
 or keeps running bottle/django on different ports for the different
 branches.

 Is there something in the wonderful python world which I could bend to
 my needs?

 I'll probably have to script something myself anyway, but any
 suggestions is welcome, since I don't have much experience with web
 stuff..


-- 
http://mail.python.org/mailman/listinfo/python-list


running multiple django/bottle instances

2013-01-03 Thread Andrea Crotti

I'm working on a quite complex web app that uses django and bottle
(bottle for the API which is also restful).

Before I came they started to use a staging server to be able to try out
things properly before they get published, but now we would like to have
the possibility to see multiple branches at a time.

First we thought about multiple servers, but actually since bottle and
django can be made to run on different ports, I thought why not running
everything on one server on different ports?

We also use elasticsearch and couchdb for the data, but these two
don't change that much and can just be a single instance.

So what would be really great could be

staging_server/branch_x
staging_server/branch_y

and something keeps track of all the various branches tracked, and run
or keeps running bottle/django on different ports for the different
branches.

Is there something in the wonderful python world which I could bend to
my needs?

I'll probably have to script something myself anyway, but any
suggestions is welcome, since I don't have much experience with web stuff..
--
http://mail.python.org/mailman/listinfo/python-list


Re: forking and avoiding zombies!

2012-12-11 Thread andrea crotti
Yes I wanted to avoid doing something too complex, anyway I'll just
comment it well and add a link to the original code..

But this is now failing to me:

def daemonize(stdin='/dev/null', stdout='/dev/null', stderr='/dev/null'):
    # Perform first fork.
    try:
        pid = os.fork()
        if pid > 0:
            sys.exit(0)  # Exit first parent.
    except OSError as e:
        sys.stderr.write("fork #1 failed: (%d) %s\n" % (e.errno, e.strerror))
        sys.exit(1)

    # Decouple from parent environment.
    os.chdir("/")
    os.umask(0)
    os.setsid()

    # Perform second fork.
    try:
        pid = os.fork()
        if pid > 0:
            sys.exit(0)  # Exit second parent.
    except OSError, e:
        sys.stderr.write("fork #2 failed: (%d) %s\n" % (e.errno, e.strerror))
        sys.exit(1)

    # The process is now daemonized, redirect standard file descriptors.
    sys.stdout.flush()
    sys.stderr.flush()

    si = file(stdin, 'r')
    so = file(stdout, 'a+')
    se = file(stderr, 'a+', 0)
    os.dup2(si.fileno(), sys.stdin.fileno())
    os.dup2(so.fileno(), sys.stdout.fileno())
    os.dup2(se.fileno(), sys.stderr.fileno())


if __name__ == '__main__':
    daemonize(stdout='sample_file', stderr='sample')
    print("hello world, now should be the child!")


[andrea@andreacrotti experiments]$ python2 daemon.py
Traceback (most recent call last):
  File "daemon.py", line 49, in <module>
    daemonize(stdout='sample_file', stderr='sample')
  File "daemon.py", line 41, in daemonize
    so = file(stdout, 'a+')
IOError: [Errno 13] Permission denied: 'sample_file'

The parent process can write to that file easily, but the child can't,
why is it working for you and not for me though?
(Running this on Linux with a non-root user)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: forking and avoiding zombies!

2012-12-11 Thread andrea crotti
Ah sure that makes sense!

But actually why do I need to move away from the current directory of
the parent process?
In my case it's actually useful to be in the same directory, so maybe
I can skip that part,
or otherwise I need another chdir after..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: forking and avoiding zombies!

2012-12-11 Thread andrea crotti
2012/12/11 peter pjmak...@gmail.com:
 On 12/11/2012 10:25 AM, andrea crotti wrote:

 Ah sure that makes sense!

 But actually why do I need to move away from the current directory of
 the parent process?
 In my case it's actually useful to be in the same directory, so maybe
 I can skip that part,
 or otherwise I need another chdir after..

 You don't need to move away from the current directory. You can use os to
 get the current working directory

 stderrfile = '%s/error.log' % os.getcwd()
 stdoutfile = '%s/out.log' % os.getcwd()

 then call the daemon function like this.

 daemonize(stdout=stdoutfile, stderr=stderrfile)


But the nice thing now is that all the forked processes log also on
stdout/stderr, so if I launch the parent in the shell I get

DEBUG - csim_flow.worker[18447]: Moving the log file
/user/sim/tests/batch_pdump_records/running/2012_12_11_13_4_10.log to
/user/sim/tests/batch_pdump_records/processed/2012_12_11_13_4_10.log

DEBUG - csim_flow.area_manager[19410]: No new logs found in
/user/sim/tests/batch_pdump_records

where in [] I have the PID of the process.
In this suggested way I should use some other files as standard output
and error, but for that I already have the logging module that logs
in the right place..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: forking and avoiding zombies!

2012-12-11 Thread andrea crotti
2012/12/11 Jean-Michel Pichavant jeanmic...@sequans.com:
 - Original Message -
 So I implemented a simple decorator to run a function in a forked
 process, as below.

 It works well but the problem is that the childs end up as zombies on
 one machine, while strangely
 I can't reproduce the same on mine..

 I know that this is not the perfect method to spawn a daemon, but I
 also wanted to keep the code
 as simple as possible since other people will maintain it..

 What is the easiest solution to avoid the creation of zombies and
 maintain this functionality?
 thanks


 def on_forked_process(func):
     from os import fork, _exit
     """Decorator that forks the process, runs the function and gives
     back control to the main process
     """
     def _on_forked_process(*args, **kwargs):
         pid = fork()
         if pid == 0:
             func(*args, **kwargs)
             _exit(0)
         else:
             return pid

     return _on_forked_process
 --
 http://mail.python.org/mailman/listinfo/python-list


 Ever thought about using the 'multiprocessing' module? It's a slightly higher-level 
 API and I don't have issues with zombie processes.
 You can combine this with a multiprocess log listener so that all logs are 
 sent to the main process.

 See Vinay Sajip's code about multiprocessing and logging, 
 http://plumberjack.blogspot.fr/2010/09/using-logging-with-multiprocessing.html

 I still had to write some cleanup code before leaving the main process, but 
 once terminate is called on all remaining subprocesses, I'm not left with 
 zombie processes.
 Here's the cleaning:

 for proc in multiprocessing.active_children():
 proc.terminate()

 JM




Yes I thought about that but I want to be able to kill the parent
without killing the children, because they can run for a long time..

Anyway I got something working now with this

def daemonize(func):

    def _daemonize(*args, **kwargs):
        # Perform first fork.
        try:
            pid = os.fork()
            if pid > 0:
                sys.exit(0)  # Exit first parent.
        except OSError as e:
            sys.stderr.write("fork #1 failed: (%d) %s\n" % (e.errno,
                                                            e.strerror))
            sys.exit(1)

        # Decouple from parent environment.
        # check if decoupling here makes sense in our case
        # os.chdir("/")
        # os.umask(0)
        # os.setsid()

        # Perform second fork.
        try:
            pid = os.fork()
            if pid > 0:
                return pid

        except OSError, e:
            sys.stderr.write("fork #2 failed: (%d) %s\n" % (e.errno,
                                                            e.strerror))
            sys.exit(1)

        # The process is now daemonized, redirect standard file descriptors.
        sys.stdout.flush()
        sys.stderr.flush()
        func(*args, **kwargs)

    return _daemonize


@daemonize
def long_smarter_process():
    while True:
        sleep(2)
        print("Hello how are you?")


And it works exactly as before, but more correctly..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: forking and avoiding zombies!

2012-12-11 Thread andrea crotti
2012/12/11 Dennis Lee Bieber wlfr...@ix.netcom.com:
 On Tue, 11 Dec 2012 10:34:23 -0300, peter pjmak...@gmail.com declaimed
 the following in gmane.comp.python.general:


 stderrfile = '%s/error.log' % os.getcwd()
 stdoutfile = '%s/out.log' % os.getcwd()

 Ouch...

  stdoutfile = os.path.join(os.getcwd(), "out.log")

 minimizes any OS specific quirks in path naming...
 --
 Wulfraed Dennis Lee Bieber AF6VN
 wlfr...@ix.netcom.comHTTP://wlfraed.home.netcom.com/

 --
 http://mail.python.org/mailman/listinfo/python-list


Good point yes, but in this case fork doesn't work on Windows anyway
so it's not really an issue..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: creating size-limited tar files

2012-11-14 Thread andrea crotti
2012/11/14 Kushal Kumaran kushal.kumaran+pyt...@gmail.com:

 Well, well, I was wrong, clearly.  I wonder if this is fixable.

 --
 regards,
 kushal
 --
 http://mail.python.org/mailman/listinfo/python-list

But would it not be possible to use the pipe in memory in theory?
That would be way faster and since I have in theory enough RAM it
might be a great improvement..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: creating size-limited tar files

2012-11-14 Thread andrea crotti
2012/11/14 Dave Angel d...@davea.name:
 On 11/14/2012 10:56 AM, andrea crotti wrote:
 Ok this is all very nice, but:

 [andrea@andreacrotti tar_baller]$ time python2 test_pipe.py > /dev/null

 real  0m21.215s
 user  0m0.750s
 sys   0m1.703s

 [andrea@andreacrotti tar_baller]$ time ls -lR /home/andrea | cat > /dev/null

 real  0m0.986s
 user  0m0.413s
 sys   0m0.600s

 snip


 So apparently it's way slower than using this system, is this normal?

 I'm not sure how this timing relates to the thread, but what it mainly
 shows is that starting up the Python interpreter takes quite a while,
 compared to not starting it up.


 --

 DaveA



Well it's related because my program has to be as fast as possible, so
in theory I thought that using Python pipes would be better because I
can get easily the PID of the first process.

But if it's so slow then it's not worth it, and I don't think it is the
Python interpreter, because it's more or less constantly many times
slower even when changing the size of the input..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: creating size-limited tar files

2012-11-14 Thread Andrea Crotti

On 11/14/2012 04:33 PM, Dave Angel wrote:

Well, as I said, I don't see how the particular timing has anything to
do with the rest of the thread.  If you want to do an ls within a Python
program, go ahead.  But if all you need can be done with ls itself, then
it'll be slower to launch python just to run it.

Your first timing runs python, which runs two new shells, ls, and cat.
Your second timing runs ls and cat.

So the difference is starting up python, plus starting the shell two
extra times.

I'd also be curious if you flushed the system buffers before each
timing, as the second test could be running entirely in system memory.
And no, I don't know offhand how to flush them in Linux, just that
without it, your timings are not at all repeatable.  Note the two
identical runs here.

davea@think:~/temppython$ time ls -lR ~ | cat > /dev/null

real0m0.164s
user0m0.020s
sys 0m0.000s
davea@think:~/temppython$ time ls -lR ~ | cat > /dev/null

real0m0.018s
user0m0.000s
sys 0m0.010s

real time goes down by 90%, while user time drops to zero.
And on a 3rd and subsequent run, sys time goes to zero as well.



Right I didn't think about that..
Anyway the only thing I wanted to understand is if using the pipes in 
subprocess is exactly the same as doing

the Linux pipe, or not.

And any idea on how to run it in ram?
Maybe if I create a pipe in tmpfs it might already work, what do you think?
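
For what it's worth, the subprocess version of that pipeline uses a plain OS
pipe between the two processes (no intermediate file), and the PID of the
first process is directly available; a sketch:

import subprocess

ls = subprocess.Popen(['ls', '-lR', '/home/andrea'], stdout=subprocess.PIPE)
cat = subprocess.Popen(['cat'], stdin=ls.stdout,
                       stdout=open('/dev/null', 'w'))
ls.stdout.close()   # so ls gets SIGPIPE if cat exits first
cat.wait()
print("first pid: %d" % ls.pid)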
--
http://mail.python.org/mailman/listinfo/python-list


Re: creating size-limited tar files

2012-11-08 Thread andrea crotti
2012/11/7 Oscar Benjamin oscar.j.benja...@gmail.com:

 Correct. But if you read the rest of Alexander's post you'll find a
 suggestion that would work in this case and that can guarantee to give
 files of the desired size.

 You just need to define your own class that implements a write()
 method and then distributes any data it receives to separate files.
 You can then pass this as the fileobj argument to the tarfile.open
 function:
 http://docs.python.org/2/library/tarfile.html#tarfile.open


 Oscar



Yes yes I saw the answer, but now I was thinking that what I need is
simply this:
tar czpvf - /path/to/archive | split -d -b 100M - tardisk

since it should run only on Linux it's probably way easier, my script
will then only need to create the list of files to tar..

The only doubt is whether this is more or less reliable than doing it in
Python; when can this fail with some bad broken pipe?
(the filesystem is not very good as I said and it's mounted with NFS)
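
For completeness, Oscar's fileobj suggestion could be sketched roughly like
this (untested; the limit applies to the compressed stream, and the chunks
can still be reassembled with a plain cat):

import tarfile

class SplittingWriter(object):
    # file-like object: spreads whatever is written to it over numbered
    # chunk files of at most chunk_size bytes each
    def __init__(self, prefix, chunk_size):
        self.prefix, self.chunk_size = prefix, chunk_size
        self.index, self.written = 0, 0
        self.current = open('%s.%03d' % (prefix, self.index), 'wb')

    def write(self, data):
        while data:
            room = self.chunk_size - self.written
            self.current.write(data[:room])
            self.written += len(data[:room])
            data = data[room:]
            if self.written >= self.chunk_size:
                self.current.close()
                self.index += 1
                self.written = 0
                self.current = open('%s.%03d' % (self.prefix, self.index), 'wb')

    def close(self):
        self.current.close()

out = SplittingWriter('archive.tar.gz', 100 * 1024 * 1024)
tar = tarfile.open(fileobj=out, mode='w|gz')   # streaming mode, no seeking needed
tar.add('/path/to/archive')
tar.close()
out.close()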
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: creating size-limited tar files

2012-11-08 Thread andrea crotti
2012/11/8 andrea crotti andrea.crott...@gmail.com:



 Yes yes I saw the answer, but now I was thinking that what I need is
 simply this:
 tar czpvf - /path/to/archive | split -d -b 100M - tardisk

 since it should run only on Linux it's probably way easier, my script
 will then only need to create the list of files to tar..

 The only doubt is if this is more or less reliably then doing it in
 Python, when can this fail with some bad broken pipe?
 (the filesystem is not very good as I said and it's mounted with NFS)

In the meanwhile I tried a couple of things, and using the pipe on
Linux actually works very nicely, it's even faster than simple tar for
some reason..

[andrea@andreacrotti isos]$ time tar czpvf - file1.avi file2.avi |
split -d -b 1000M - inchunks
file1.avi
file2.avi

real1m39.242s
user1m14.415s
sys 0m7.140s

[andrea@andreacrotti isos]$ time tar czpvf total.tar.gz file1.avi file2.avi
file1.avi
file2.avi

real1m41.190s
user1m13.849s
sys 0m5.723s

[andrea@andreacrotti isos]$ time split -d -b 1000M total.tar.gz inchunks

real0m55.282s
user0m0.020s
sys 0m3.553s
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: creating size-limited tar files

2012-11-07 Thread Andrea Crotti

On 11/07/2012 08:32 PM, Roy Smith wrote:

In article 509ab0fa$0$6636$9b4e6...@newsspool2.arcor-online.net,
  Alexander Blinne n...@blinne.net wrote:


I don't know the best way to find the current size, I only have a
general remark.
This solution is not so good if you have to impose a hard limit on the
resulting file size. You could end up having a tar file of size limit +
size of biggest file - 1 + overhead in the worst case if the tar is at
limit - 1 and the next file is the biggest file. Of course that may be
acceptable in many cases or it may be acceptable to do something about
it by adjusting the limit.

If you truly have a hard limit, one possible solution would be to use
tell() to checkpoint the growing archive after each addition.  If adding
a new file unexpectedly causes you exceed your hard limit, you can
seek() back to the previous spot and truncate the file there.

Whether this is worth the effort is an exercise left for the reader.


So I'm not sure if it's a hard limit or not, but I'll check tomorrow.
But in general for the size I could also take the size of the files and
simply estimate the size of all of them, pushing in as many as should
fit in a tarfile.
With compression I might get a much smaller file maybe, but it would be
much easier..


But the other problem is that at the moment the people that get our 
chunks reassemble the file with a simple:


cat file1.tar.gz file2.tar.gz > file.tar.gz

which I suppose is not going to work if I create 2 different tar files, 
since it would recreate the header in all of them, right?
So or I give also a script to reassemble everything or I have to split 
in a more brutal way..


Maybe after all doing the final split was not too bad, I'll first check 
if it's actually more expensive for the filesystem (which is very very slow)
or it's not a big deal...
--
http://mail.python.org/mailman/listinfo/python-list


Re: accepting file path or file object?

2012-11-05 Thread andrea crotti
2012/11/5 Peter Otten __pete...@web.de:
 I sometimes do something like this:

 $ cat xopen.py
 import re
 import sys
 from contextlib import contextmanager

 @contextmanager
 def xopen(file=None, mode="r"):
     if hasattr(file, "read"):
         yield file
     elif file == "-":
         if "w" in mode:
             yield sys.stdout
         else:
             yield sys.stdin
     else:
         with open(file, mode) as f:
             yield f

 def grep(stream, regex):
     search = re.compile(regex).search
     return any(search(line) for line in stream)

 if len(sys.argv) == 1:
     print grep(["alpha", "beta", "gamma"], "gamma")
 else:
     with xopen(sys.argv[1]) as f:
         print grep(f, sys.argv[2])
 $ python xopen.py
 True
 $ echo 'alpha beta gamma' | python xopen.py - gamma
 True
 $ echo 'alpha beta gamma' | python xopen.py - delta
 False
 $ python xopen.py xopen.py context
 True
 $ python xopen.py xopen.py gamma
 True
 $ python xopen.py xopen.py delta
 False
 $


 --
 http://mail.python.org/mailman/listinfo/python-list

That's nice thanks, there is still the problem of closing the file
handle but that's maybe not so important if it gets closed at
termination anyway..
-- 
http://mail.python.org/mailman/listinfo/python-list


lazy properties?

2012-11-01 Thread Andrea Crotti
Seeing the wonderful lazy val in Scala I thought that I should try to 
get the following also in Python.

The problem is that I often have this pattern in my code:

class Sample:
    def __init__(self):
        self._var = None

    @property
    def var(self):
        if self._var is None:
            self._var = long_computation()
        return self._var


which is quite useful when you have some expensive attribute to compute 
that is not going to change.
I was trying to generalize it in a @lazy_property but my attempts so far 
failed, any help on how I could do that?


What I would like to write is
@lazy_property
def var_lazy(self):
    return long_computation()

and this should imply that the long_computation is called only once..
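
One possible shape for it (a sketch I haven't battle-tested): a non-data
descriptor that runs the function once and caches the result on the
instance, so later lookups never reach it again:

class lazy_property(object):
    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, obj, owner=None):
        if obj is None:
            return self
        value = self.func(obj)
        # storing the result under the same name shadows the descriptor,
        # so the computation happens exactly once per instance
        setattr(obj, self.name, value)
        return value


class Sample(object):
    @lazy_property
    def var_lazy(self):
        print("computing...")
        return 42

s = Sample()
print(s.var_lazy)   # prints "computing..." then 42
print(s.var_lazy)   # 42, no recomputation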
--
http://mail.python.org/mailman/listinfo/python-list


Re: Nice solution wanted: Hide internal interfaces

2012-10-30 Thread andrea crotti
2012/10/30 alex23 wuwe...@gmail.com:
 On Oct 30, 2:33 am, Johannes Bauer dfnsonfsdu...@gmx.de wrote:
 I'm currently looking for a good solution to the following problem: I
 have two classes A and B, which interact with each other and which
 interact with the user. Instances of B are always created by A.

 Now I want A to call some private methods of B and vice versa (i.e. what
 C++ friends are), but I want to make it hard for the user to call
 these private methods.

 One approach could be to only have the public interface on B, and then
 create a wrapper for B that provides the private interface:

 class B:
 def public_method(self):
 pass

 class B_Private:
 def __init__(self, context):
 self.context = context

 def private_method(self):
 # manipulate self.context

 class A:
 def __init__(self):
 self.b = B()
 self.b_private = B_Private(self.b)

 def foo(self):
 # call public method
 self.b.public_method()

 # call private method
 self.b_private.private_method()

 It doesn't stop a user from accessing the private methods, but it does
 separate them so they have to *intentionally* choose to use them.
 --
 http://mail.python.org/mailman/listinfo/python-list



Partly unrelated, but you could also define a clear API and expose it
through your __init__.py.

For example:
package/a.py:
class A: pass

package/b.py:
class B:pass

package/__init__.py
from a import A

so now doing from package import * will only show A.

This doesn't work on the method-level, but it's useful to know and
commonly done in many projects..


In some projects they even use a file api.py so that you have to
explicitly import

from package.api import ..
(which I think is overkill since __init__.py does the same)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Immutability and Python

2012-10-29 Thread andrea crotti
2012/10/29 Jean-Michel Pichavant jeanmic...@sequans.com:

 return NumWrapper(self.number + 1) 

 still returns a(nother) mutable object.

 So what's the point of all this ?

 JM


Well sure but it doesn't modify the first object, just creates a new
one.  There are in general good reasons to do that, for example I can
then compose things nicely:

num.increment().increment()

or I can parallelize operations safely not caring about the order of
operations.

But while I do this all the time with more functional languages, I
don't tend to do exactly the same in Python, because I have the
impression that is not worth, but maybe I'm wrong..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Immutability and Python

2012-10-29 Thread andrea crotti
2012/10/29 andrea crotti andrea.crott...@gmail.com:


 Well sure but it doesn't modify the first object, just creates a new
 one.  There are in general good reasons to do that, for example I can
 then compose things nicely:

 num.increment().increment()

 or I can parallelize operations safely not caring about the order of
 operations.

 But while I do this all the time with more functional languages, I
 don't tend to do exactly the same in Python, because I have the
 impression that is not worth, but maybe I'm wrong..


By the way on this topic there is a great talk by the creator of
Clojure: http://www.infoq.com/presentations/Value-Values
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Immutability and Python

2012-10-29 Thread andrea crotti
2012/10/29 Jean-Michel Pichavant jeanmic...@sequans.com:


 In an OOP language num.increment() is expected to modify the object in place.
 So I think you're right when you say that functional languages technics do 
 not necessarily apply to Python, because they don't.

 I would add that what you're trying to suggest in the first post was not 
 really about immutability, immutable objects in python are ... well 
 immutable, they can be used as a dict key for instance, your NumWrapper 
 object cannot.


 JM

Yes right immutable was not the right word, I meant that as a contract
with myself I'm never going to modify its state.

Also, how do I make an immutable object in pure Python anyway?

But the example with the dictionary is not correct though, because this:

In [145]: class C(object):
     ...:     def __hash__(self):
     ...:         return 42
     ...:

In [146]: d = {C(): 1}

works perfectly, but an object of class C can mutate as much as it
wants, as my NumWrapper instance..
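
As an aside, one common way to get a practically immutable, hashed-by-value
object in pure Python is to build on namedtuple; a rough sketch:

from collections import namedtuple

class Num(namedtuple('Num', 'number')):
    __slots__ = ()

    def increment(self):
        return Num(self.number + 1)

n = Num(1)
d = {n: 'one'}       # usable as a dict key, hashed by its value
# n.number = 5       # would raise AttributeError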
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Immutability and Python

2012-10-29 Thread andrea crotti
2012/10/29 Chris Angelico ros...@gmail.com:
 On Tue, Oct 30, 2012 at 2:55 AM, Paul Rubin no.email@nospam.invalid wrote:
 andrea crotti andrea.crott...@gmail.com writes:
 and we want to change its state incrementing the number ...
  the immutability purists would instead suggest to do this:
 def increment(self):
 return NumWrapper(self.number + 1)

 Immutability purists would say that numbers don't have state and if
 you're trying to change a number's state by incrementing it, that's not
 immutability.  You end up with a rather different programming style than
 imperative programming, for example using tail recursion (maybe wrapped
 in an itertools-like higher-order function) instead of indexed loops to
 iterate over a structure.

 In that case, rename increment to next_integer and TYAOOYDAO. [1]
 You're not changing the state of this number, you're locating the
 number which has a particular relationship to this one (in the same
 way that GUI systems generally let you locate the next and previous
 siblings of any given object).

 ChrisA
 [1] there you are, out of your difficulty at once - cf WS Gilbert's 
 Iolanthe
 --
 http://mail.python.org/mailman/listinfo/python-list


Yes the name should be changed, but the point is that they are both
ways to implement the same thing.

For example suppose I want to have 10 objects (for some silly reason)
that represent the next number, in the first case I would do:

numbers = [NumWrapper(orig.number) for _ in range(10)]
for num in numbers:
    num.increment()

while in the second it is as simple as:
numbers = [orig.next_number()] * 10

composing things becomes much easier, but as a downside it's not always
so easy and convenient to write code in this way; it probably depends
on the use case..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Nice solution wanted: Hide internal interfaces

2012-10-29 Thread andrea crotti
2012/10/29 Johannes Bauer dfnsonfsdu...@gmx.de:
 Hi there,

 I'm currently looking for a good solution to the following problem: I
 have two classes A and B, which interact with each other and which
 interact with the user. Instances of B are always created by A.

 Now I want A to call some private methods of B and vice versa (i.e. what
 C++ friends are), but I want to make it hard for the user to call
 these private methods.

 Currently my ugly approach is this: I delare the internal methods
 private (hide from user). Then I have a function which gives me a
 dictionary of callbacks to the private functions of the other objects.
 This is in my opinion pretty ugly (but it works and does what I want).

 I'm pretty damn sure there's a nicer (prettier) solution out there, but
 I can't currently think of it. Do you have any hints?

 Best regards,
 Joe


And how are you declaring methods private?  Because there is no real
private attribute in Python, if you declare them with a starting _
they are still perfectly accessible..
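
A quick illustration of the point (nothing from the original code):

class B(object):
    def _internal(self):           # leading underscore: convention only
        return "internal"

    def __mangled(self):           # double underscore: name-mangled, still reachable
        return "mangled"

b = B()
print(b._internal())               # works fine
print(b._B__mangled())             # name mangling is an obstacle, not protection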
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: resume execution after catching with an excepthook?

2012-10-25 Thread andrea crotti
2012/10/25 Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:
 On Wed, 24 Oct 2012 13:51:30 +0100, andrea crotti wrote:

 So I would like to be able to ask for confirmation when I receive a C-c,
 and continue if the answer is N/n.

 I don't think there is any way to do this directly.

 Without a try...except block, execution will cease after an exception is
 caught, even when using sys.excepthook. I don't believe that there is any
 way to jump back to the line of code that just failed (and why would you,
 it will just fail again) or the next line (which will likely fail because
 the previous line failed).

 I think the only way you can do this is to write your own execution loop:

 while True:
 try:
 run(next_command())
 except KeyboardInterrupt:
 if confirm_quit():
 break


 Of course you need to make run() atomic, or use transactions that can be
 reverted or backed out of. How plausible this is depends on what you are
 trying to do -- Python's Ctrl-C is not really designed to be ignored.

 Perhaps a better approach would be to treat Ctrl-C as an unconditional
 exit, and periodically poll the keyboard for another key press to use as
 a conditional exit. Here's a snippet of platform-specific code to get a
 key press:

 http://code.activestate.com/recipes/577977

 Note however that it blocks if there is no key press waiting.

 I suspect that you may need a proper event loop, as provided by GUI
 frameworks, or curses.



 --
 Steven
 --
 http://mail.python.org/mailman/listinfo/python-list



Ok thanks, but here the point is not to resume something that is going
to fail again, just to avoid accidentally killing processes that take a
long time.  Probably this is needed only by me in debugging mode, but anyway
I can do the simple try/except then, thanks..
-- 
http://mail.python.org/mailman/listinfo/python-list


resume execution after catching with an excepthook?

2012-10-24 Thread andrea crotti
So I would like to be able to ask for confirmation when I receive a C-c,
and continue if the answer is N/n.

I'm already using an exception handler set with sys.excepthook, but I
can't make it work with the confirm_exit, because it's going to quit in
any case..

A possible solution would be to do a global try/except
KeyboardInterrupt, but since I already have an excepthook I wanted to
use this.  Any way to make it continue where it was running after the
exception is handled?


def confirm_exit():
    while True:
        q = raw_input("This will quit the program, are you sure? [y/N]")
        if q in ('y', 'Y'):
            sys.exit(0)
        elif q in ('n', 'N'):
            print("Continuing execution")
            # just go back to normal execution, is it possible??
            break


def _exception_handler(etype, value, tb):
    if etype == KeyboardInterrupt:
        confirm_exit()
    else:
        sys.exit(1)


def set_exception_handler():
    sys.excepthook = _exception_handler
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Testing against multiple versions of Python

2012-10-19 Thread andrea crotti
2012/10/19 Michele Simionato michele.simion...@gmail.com:
 Yesterday I released a new version of the decorator module. It should run 
 under Python 2.4, 2.5, 2.6, 2.7, 3.0, 3.1, 3.2, 3.3. I did not have the will 
 to install on my machine 8 different versions of Python, so I just tested it 
 with Python 2.7 and 3.3. But I do not feel happy with that. Is there any kind 
 of service where a package author can send a pre-release version of her 
 package and have its tests run againsts a set of different Python versions?
 I seem to remember somebody talking about a service like that years ago but I 
 don't remembers. I do not see anything on PyPI. Any advice is welcome!

  Michele Simionato


 --
 http://mail.python.org/mailman/listinfo/python-list


Travis on github maybe is what you want?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: locking files on Linux

2012-10-19 Thread andrea crotti
2012/10/18 Oscar Benjamin oscar.j.benja...@gmail.com:

 The lock is cooperative. It does not prevent the file from being
 opened or overwritten. It only prevents any other process from
 obtaining the lock. Here you open the file with mode 'w' which
 truncates the file instantly (without checking for the lock).


 Oscar


Very good thanks now I understood, actually my problem was in the
assumption that it should fail when the lock is already taken, but by
default lockf just blocks until the lock is released.

It seems to work quite nicely so I'm going to use this..
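
For the record, a small sketch of the difference (blocking by default versus
failing immediately with LOCK_NB); the file name is just an example:

import fcntl

f = open('settings.ini', 'r+')
fcntl.lockf(f, fcntl.LOCK_EX)            # blocks until any other lock is released
# ... read/modify the file ...
fcntl.lockf(f, fcntl.LOCK_UN)

# non-blocking variant: fails at once if someone else holds the lock
try:
    fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
except IOError:
    print("file is locked by another process")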
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: locking files on Linux

2012-10-18 Thread andrea crotti
2012/10/18 Grant Edwards invalid@invalid.invalid:
 On 2012-10-18, andrea crotti andrea.crott...@gmail.com wrote:


 File locks under Unix have historically been advisory.  That means
 that programs have to _choose_ to pay attention to them.  Most
 programs do not.

 Linux does support mandatory locking, but it's rarely used and must be
 manually enabled at the filesystem level. It's probably worth noting
 that in the Linux kernel docs, the document on mandatory file locking
 begins with a section titled Why you should avoid mandatory locking.

 http://en.wikipedia.org/wiki/File_locking#In_Unix-like_systems
 http://kernel.org/doc/Documentation/filesystems/locks.txt
 http://kernel.org/doc/Documentation/filesystems/mandatory-locking.txt
 http://www.thegeekstuff.com/2012/04/linux-file-locking-types/
 http://www.hackinglinuxexposed.com/articles/20030623.html

 --
 Grant Edwards   grant.b.edwardsYow! Your CHEEKS sit like
   at   twin NECTARINES above
   gmail.coma MOUTH that knows no
BOUNDS --
 --
 http://mail.python.org/mailman/listinfo/python-list


Uhh I see thanks, I guess I'll use the good-old .lock file (even if it
might have some problems too).

Anyway I'm only afraid that my same application could modify the
files, so maybe I can instruct it to check if the file is locked.

Or maybe using sqlite would work even if writing from different
processes?

I would prefer to keep something human-readable like the INI format though,
rather than a sqlite file..
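
If it ends up being the good-old .lock file, the usual trick is to rely on
os.open with O_CREAT | O_EXCL being atomic; a sketch (not tested against the
real setup):

import errno
import os

def acquire_lock(path):
    lockfile = path + '.lock'
    try:
        fd = os.open(lockfile, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except OSError as e:
        if e.errno == errno.EEXIST:
            return None              # somebody else holds the lock
        raise
    os.write(fd, str(os.getpid()))
    os.close(fd)
    return lockfile

def release_lock(lockfile):
    os.remove(lockfile)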

Thanks
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: locking files on Linux

2012-10-18 Thread andrea crotti
2012/10/18 Oscar Benjamin oscar.j.benja...@gmail.com:

 Why not come up with a test that actually shows you if it works? Here
 are two suggestions:

 1) Use time.sleep() so that you know how long the lock is held for.
 2) Write different data into the file from each process and see what
 you end up with.



Ok thanks I will try, but I thought that what I did was the worst
possible case, because I'm opening and writing on the same file from
two different processes, locking the file with LOCK_EX.

It should not open it at all as far as I understood...
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Generating C++ code

2012-10-10 Thread andrea crotti
2012/10/10 Jean-Michel Pichavant jeanmic...@sequans.com:
 Well, the C++ code will end up running on a MIPS on a SOC, unfortunately, 
 python is not an option here.
 The xml to C++ makes a lot of sense, because only a small part of the code is 
 generated that way (everything related to log  fatal events). Everything 
 else is written directly in C++.

 To answer Andrea's question, the files are regenerated for every compilation 
 (well, unless the xml didn't change, but the xml is highly subject to 
 changes, that's actually its purpose)

 Currently we already have a python script that translate this xml file to 
 C++, but it's done in a way that is difficult to maintain. Basically, when 
 parsing the xml file, it writes the generated C++ code. Something like:
 if 'blabla' in xml:
   h_file.write(#define blabla 55, append=top)
   c_file.write(someglobal = blabla, append=bottom)

 This is working, but the python code is quite difficult to maintain, there's 
 a lot of escaping going on, it's almost impossible to see the structure of 
 the c files unless generating one and hopping it's successful. It's also 
 quite difficult to insert code exactly where you want, because you do not 
 know the order in which the xml trees are defined then parsed.

 I was just wondering if a template engine would help. Maybe not.

 JM
 --
 http://mail.python.org/mailman/listinfo/python-list


I think it depends on what you're writing from the XML, are you
generating just constants (like the #define) or also new classes for
example?

If it's just constants, why don't you do a generation from XML to INI
(or something similar) and then parse that in the C++ properly? Then it
would be very easy to do.

You could also parse the XML in the first place but probably that's
harder given your requirements, but I don't think that an ini file
would be a problem, or would it?
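
To make the suggestion concrete, a rough sketch of such an XML -> INI step
(the element and section names are invented):

import ConfigParser
import xml.etree.ElementTree as ET

def xml_to_ini(xml_path, ini_path):
    tree = ET.parse(xml_path)
    config = ConfigParser.RawConfigParser()
    config.add_section('constants')
    # assuming the XML contains <constant name="..." value="..."/> elements
    for const in tree.findall('.//constant'):
        config.set('constants', const.get('name'), const.get('value'))
    with open(ini_path, 'w') as out:
        config.write(out)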
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Generating C++ code

2012-10-09 Thread Andrea Crotti

On 10/09/2012 05:00 PM, Jean-Michel Pichavant wrote:

Greetings,

I'm trying to generate C++ code from an XML file. I'd like to use a template 
engine, which imo produce something readable and maintainable.
My google search about this subject has been quite unsuccessful, I've been 
redirected to template engine specific to html mostly.

Does anybody knows a python template engine for generating C++ code ?

Here's my flow:

XML file - nice python app - C++ code

 From what I know I could use Cheetah, a generic template engine. I never used 
it though, I'm not sure this is what I need.
I'm familiar with jinja2 but I'm not sure I could use it to generate C++ code, 
did anybody try ? (maybe that's a silly question)

Any advice would be appreciated.

JM


I think you can use anything to generate C++ code, but is it a good idea?
Are you going to produce this code only one time and then maintain it 
manually?


And are you sure that the design that you would get from the XML file 
actually makes sense when translated into C++?
--
http://mail.python.org/mailman/listinfo/python-list


Re: PHP vs. Python

2012-09-25 Thread andrea crotti
2012/9/25  tejas.tank@gmail.com:
 On Thursday, 23 December 2004 03:33:36 UTC+5:30, (unknown)  wrote:
 Anyone know which is faster?  I'm a PHP programmer but considering
 getting into Python ... did searches on Google but didn't turn much up
 on this.

 Thanks!
 Stephen


 Here some helpful gudance.

 http://hentenaar.com/serendipity/index.php?/archives/27-Benchmark-PHP-vs.-Python-vs.-Perl-vs.-Ruby.html
 --
 http://mail.python.org/mailman/listinfo/python-list


Quite ancient versions of everything, would be interesting to see if
things are different now..

Anyway you can switch to Python happily, it might not be faster but
99% of the times that's not an issue..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python presentations

2012-09-24 Thread andrea crotti
For anyone interested, I already moved the slides on github
(https://github.com/AndreaCrotti/pyconuk2012_slides)
and for example the decorator slides will be generated from this:

https://raw.github.com/AndreaCrotti/pyconuk2012_slides/master/deco_context/deco.rst

Notice the literalinclude with :pyobject: which allows to include any
function or class automatically very nicely from external files ;)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: 'str' object does not support item assignment

2012-09-23 Thread Andrea Crotti

On 09/23/2012 07:31 PM, jimbo1qaz wrote:

spots[y][x]=mark fails with a 'str' object does not support item assignment 
error,even though:

a=[["a"]]
a[0][0]="b"

and:

a=[["a"]]
a[0][0]=100

both work.
Spots is a nested list created as a copy of another list.


But
a = 'a'
a[0] = 'c'
fails for the same reason, which is that strings in Python are immutable..
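
The usual workaround, if the rows really have to be modified in place, is to
build the nested structure out of lists instead of strings ('spots' here is
just a stand-in for the structure in the original post):

spots = [list(row) for row in ["...", "...", "..."]]   # rows become mutable lists
spots[0][0] = "X"
print("".join(spots[0]))                               # back to a string when needed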
--
http://mail.python.org/mailman/listinfo/python-list


Re: subprocess call is not waiting.

2012-09-19 Thread andrea crotti
2012/9/18 Dennis Lee Bieber wlfr...@ix.netcom.com:

 Unless you have a really massive result set from that ls, that
 command probably ran so fast that it is blocked waiting for someone to
 read the PIPE.

I tried also with ls -lR / and that definitively takes a while to run,
when I do this:

proc = subprocess.Popen(['ls', '-lR', '/'], stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)

nothing is running, only when I actually do
proc.communicate()

I see the process running in top..
Is it still an observation problem?

Anyway I also need to know when the process is over while waiting, so
probably a thread is the only way..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: subprocess call is not waiting.

2012-09-19 Thread andrea crotti
2012/9/19 Hans Mulder han...@xs4all.nl:
 Yes: using top is an observation problem.

 Top, as the name suggests, shows only the most active processes.

Sure but ls -lR / is a very active process if you try to run it..
Anyway as written below I don't need this anymore.


 It's quite possible that your 'ls' process is not active, because
 it's waiting for your Python process to read some data from the pipe.

 Try using ps instead.  Look in thte man page for the correct
 options (they differ between platforms).  The default options do
 not show all processes, so they may not show the process you're
 looking for.

 Anyway I also need to know when the process is over while waiting, so
 probably a thread is the only way..

 This sounds confused.

 You don't need threads.  When 'ls' finishes, you'll read end-of-file
 on the proc.stdout pipe.  You should then call proc.wait() to reap
 its exit status (if you don't, you'll leave a zombie process).
 Since the process has already finished, the proc.wait() call will
 not actually do any waiting.


 Hope this helps,



Well there is a process which has to do two things, monitor
periodically some external conditions (filesystem / db), and launch a
process that can take very long time.

So I can't put a wait anywhere, or I'll stop everything else.  But at
the same time I need to know when the process is finished, which I
could do, but without a wait it might get hacky.

So I'm quite sure I just need to run the subprocess in a subthread
unless I'm missing something obvious..
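
That said, a thread is not strictly required: since the monitor already wakes
up periodically, it could also check the child with poll(); a sketch, where
check_external_conditions() is a hypothetical stand-in for the periodic
filesystem/db checks:

import subprocess
import time

def check_external_conditions():
    pass    # hypothetical: the periodic filesystem/db checks go here

proc = subprocess.Popen(['ls', '-lR', '/'],
                        stdout=open('listing.txt', 'w'),
                        stderr=subprocess.PIPE)

while True:
    check_external_conditions()
    if proc.poll() is not None:         # returns the exit code once the child is done
        print("child finished with %d" % proc.returncode)
        break
    time.sleep(5)

Redirecting stdout to a file (instead of a PIPE that nobody reads) also avoids
the child blocking on a full pipe buffer.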
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python presentations

2012-09-19 Thread andrea crotti
2012/9/19 Trent Nelson tr...@snakebite.org:

 FWIW, I gave a presentation on decorators to the New York Python
 User Group back in 2008.  Relevant blog post:

 http://blogs.onresolve.com/?p=48

 There's a link to the PowerPoint presentation I used in the first
 paragraph.  It's in .pptx format; let me know if you'd like it in
 some other form.

 Regards,

 Trent.


Ok thanks a lot, how long did it take for you to present that material?

Interesting the part about the learning process, I had a similar
experience, but probably skip this since I only have 30 minutes.

Another thing which I would skip or only explain how it works are
parametrized decorators; in the triple-def form they just look too ugly
to be worth the effort (but at least should be understood).
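
For reference, this is the triple-def form I mean (a generic example, not
taken from the talk):

def repeat(times):                        # outer def: takes the parameters
    def decorator(func):                  # middle def: takes the function
        def wrapper(*args, **kwargs):     # inner def: replaces the function
            result = None
            for _ in range(times):
                result = func(*args, **kwargs)
            return result
        return wrapper
    return decorator

@repeat(3)
def greet():
    print("hello")

greet()    # prints "hello" three times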
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: subprocess call is not waiting.

2012-09-18 Thread andrea crotti
I have a similar problem, something which I've never quite understood
about subprocess...
Suppose I do this:

proc = subprocess.Popen(['ls', '-lR'], stdout=subprocess.PIPE,
stderr=subprocess.PIPE)

now I created a process, which has a PID, but it's not running apparently...
It only seems to run when I actually do the wait.

I don't want to make it waiting, so an easy solution is just to use a
thread, but is there a way with subprocess?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Decorators not worth the effort

2012-09-14 Thread andrea crotti
I think one very nice and simple example of how decorators can be used is this:

def memoize(f, cache={}):
    def _memoize(*args, **kwargs):
        key = (args, str(kwargs))
        if key not in cache:
            cache[key] = f(*args, **kwargs)

        return cache[key]

    return _memoize

def fib(n):
    if n <= 1:
        return 1
    return fib(n-1) + fib(n-2)

@memoize
def fib_memoized(n):
    if n <= 1:
        return 1
    return fib_memoized(n-1) + fib_memoized(n-2)


The second fibonacci looks exactly the same, but while the first is
extremely slow (the number of recursive calls grows exponentially), the
second isn't..

I might use this example for the presentation, before explaining what it is..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Decorators not worth the effort

2012-09-14 Thread andrea crotti
2012/9/14 Chris Angelico ros...@gmail.com:

 Trouble is, you're starting with a pretty poor algorithm. It's easy to
 improve on what's poor. Memoization can still help, but I would start
 with a better algorithm, such as:

 def fib(n):
     if n<=1: return 1
     a,b=1,1
     for i in range(1,n,2):
         a+=b
         b+=a
     return b if n%2 else a

 def fib(n,cache=[1,1]):
     if n<=1: return 1
     while len(cache)<=n:
         cache.append(cache[-1] + cache[-2])
     return cache[n]

 Personally, I don't mind (ab)using default arguments for caching, but
 you could do the same sort of thing with a decorator if you prefer. I
 think the non-decorated non-recursive version is clear and efficient
 though.

 ChrisA
 --
 http://mail.python.org/mailman/listinfo/python-list


The poor algorithm is much closer to the mathematical definition
than the smarter iterative one..  And in your second version you
include some ugly caching logic inside it, so why not use a
decorator then?

I'm not saying that with the memoization is the good solution, just
that I think it's a very nice example of how to use a decorator, and
maybe a good example to start with a talk on decorators..
-- 
http://mail.python.org/mailman/listinfo/python-list


main and dependent objects

2012-09-13 Thread andrea crotti
I am in a situation where I have a class Obj which contains many
attributes, and also contains logically another object of class
Dependent.

This dependent_object, however, also needs to access many fields of the
original class, so at the moment we did something like this:


class Dependent:
    def __init__(self, orig):
        self.orig = orig

    def using_other_attributes(self):
        print("Using attr1", self.orig.attr1)


class Obj:
    def __init__(self):
        self.attr1 = "attr1"
        self.attr2 = "attr2"
        self.attr3 = "attr3"

        self.dependent_object = Dependent(self)


But I'm not so sure it's a good idea, it's a bit smelly..
Any other suggestion about how to get a similar result?

I could of course passing all the arguments needed to the constructor of
Dependent, but it's a bit tedious..


Thanks,
Andrea
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: main and dependent objects

2012-09-13 Thread andrea crotti
2012/9/13 Jean-Michel Pichavant jeanmic...@sequans.com:

 Nothing shocking right here imo. It looks like a classic parent-child 
 implementation.
 However it seems the relation between Obj and Dependent are 1-to-1. Since 
 Dependent need to access all Obj attributes, are you sure that Dependent and 
 Obj are not actually the same class ?


 JM

Yes, well, the main class is already big enough, and the relation is 1-1,
but the dependent class can also be considered separate, to split
things more nicely..

So I think it will stay like this for now and see how it goes.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python presentations

2012-09-13 Thread andrea crotti
2012/9/13 William R. Wing (Bill Wing) w...@mac.com:

 [byte]

 Speaking from experience as both a presenter and an audience member, please 
 be sure that anything you demo interactively you include in your slide deck 
 (even if only as an addendum).  I assume your audience will have access to 
 the deck after your talk (on-line or via hand-outs), and you want them to be 
 able to go home and try it out for themselves.

 Nothing is more frustrating than trying to duplicate something you saw a 
 speaker do, and fail because of some detail you didn't notice at the time of 
 the talk.  A good example is one that was discussed on the matplotlib-users 
 list several weeks ago:

 http://www.loria.fr/~rougier/teaching/matplotlib/

 -Bill


Yes that's a good point thanks, in general everything is already in a
git repository, now only in my dropbox but later I will make it
public.

Even the code that I should write there should already be written anyway,
and to make sure everything is available I could use the save function
of IPython and add it to the repository...

In general I think that explaining code on a slide (especially if it
involves new concepts) works better, but then showing what it
does is always a plus.

It's not the same to say this will go 10x faster than the previous
one as it is to show that it actually does on your machine..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python presentations

2012-09-13 Thread Andrea Crotti

On 09/13/2012 11:58 PM, Miki Tebeka wrote:

What do you think work best in general?

I find typing during class (other than small REPL examples) time consuming and 
error prone.

What works well for me is to create a slidy HTML presentation with asciidoc, 
then I can include code snippets that can be also run from the command line.
(Something like:

 [source,python,numbered]
 ---
 include::src/sin.py[]
 ---

Output example: http://i.imgur.com/Aw9oQ.png
)

Let me know if you're interested and I'll send you a example project.

HTH,
--
Miki


Yes please send me something and I'll have a look.
For my slides I'm using hieroglyph:
http://heiroglyph.readthedocs.org/en/latest/index.html

which works with sphinx, so in theory I might be able to run the code as 
well..


But in general probably the best way is to copy and paste into an IPython
session, to show that what I just explained actually works as expected..
--
http://mail.python.org/mailman/listinfo/python-list


Re: pyQT performance?

2012-09-10 Thread Andrea Crotti

On 09/10/2012 07:29 PM, jayden.s...@gmail.com wrote

Have you ever used py2exe? After converting the python codes to executable, 
does it save the time of interpreting the script language? Thank a lot!


Py2exe normally never speeds up anything: it doesn't compile your code to
native machine code, it simply packages the interpreter and your scripts
together. So I haven't tried it in this particular case, but it shouldn't
make a difference..

--
http://mail.python.org/mailman/listinfo/python-list


Re: Why doesn't Python remember the initial directory?

2012-08-20 Thread andrea crotti
2012/8/20 kj no.em...@please.post:
 In roy-ca6d77.17031119082...@news.panix.com Roy Smith r...@panix.com 
 writes This means that no library code can ever count on, for example,
 being able to reliably find the path to the file that contains the
 definition of __main__.  That's a weakness, IMO.  One manifestation
 of this weakness is that os.chdir breaks inspect.getmodule, at
 least on Unix.  If you have some Unix system handy, you can try
 the following.  First change the argument to os.chdir below to some
 valid directory other than your working directory.  Then, run the
 script, making sure that you refer to it using a relative path.
 When I do this on my system (OS X + Python 2.7.3), the script bombs
 at the last print statement, because the second call to inspect.getmodule
 (though not the first one) returns None.

 import inspect
 import os

 frame = inspect.currentframe()

 print inspect.getmodule(frame).__name__

 os.chdir('/some/other/directory') # where '/some/other/directory' is
   # different from the initial directory

 print inspect.getmodule(frame).__name__

 ...

 % python demo.py
 python demo.py
 __main__
 Traceback (most recent call last):
   File demo.py, line 11, in module
 print inspect.getmodule(frame).__name__
 AttributeError: 'NoneType' object has no attribute '__name__'

..

As in many other cases the programming language can't possibly act
safely on all the possible stupid things that the programmer wants to
do, and not understanding how an operating system works doesn't help
either..

In this specific case there is absolutely no need for os.chdir, since you
can:
- use absolute paths
- rely on things like subprocess.Popen, which accept a cwd argument
- at worst, chdir back to the previous position right after the
broken tool that requires a certain working directory has run
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why doesn't Python remember the initial directory?

2012-08-20 Thread andrea crotti
2012/8/20 Roy Smith r...@panix.com:
 In article k0tf8g$adc$1...@news.albasani.net,
  Walter Hurry walterhu...@lavabit.com wrote:

 It is difficult to think of a sensible use for os.chdir, IMHO.

 It is true that you can mostly avoid chdir() by building absolute
 pathnames, but it's often more convenient to just cd somewhere and use
 names relative to that.  Fabric (a very cool tool for writing remote
 sysadmin scripts), gives you a cd() command which is a context manager,
 making it extra convenient.

 Also, core files get created in the current directory.  Sometimes
 daemons will cd to some fixed location to make sure that if they dump
 core, it goes in the right place.

 On occasion, you run into (poorly designed, IMHO) utilities which insist
 of reading or writing a file in the current directory.  If you're
 invoking one of those, you may have no choice but to chdir() to the
 right place before running them.
 --
 http://mail.python.org/mailman/listinfo/python-list


I've done quite a lot of system programming as well, and changing
directory is only a source of possible troubles in general.

If I really have to for some reasons I do this


from os import chdir, getcwd


class TempCd:
    """Change temporarily the current directory"""

    def __init__(self, newcwd):
        self.newcwd = newcwd
        self.oldcwd = getcwd()

    def __enter__(self):
        chdir(self.newcwd)
        return self

    def __exit__(self, type, value, traceback):
        chdir(self.oldcwd)


with TempCd('/tmp'):
    pass   # now working in /tmp

# now back in the original directory

So it's not that hard to avoid problems..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sharing code between different projects?

2012-08-16 Thread andrea crotti
2012/8/16 Jean-Michel Pichavant jeanmic...@sequans.com:

 SVN allows to define external dependencies, where one repository will
 actually checkout another one at a specific version. If SVN does it, I guess
 any decent SCM also provide such feature.

 Assuming our project is named 'common', and you have 2 projects A and B :

 A
- common@rev1

 B
- common@rev2

 Project A references the lib as A.common, B as B.common. You need to be
 extra carefull to never reference common as 'common' in any place.

 JM



Unfortunately I think you guess wrong
http://forums.perforce.com/index.php?/topic/553-perforce-svnexternals-equivalent/
Anyway, with views and similar things it is not that hard to implement the
same thing..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to call perl script from html using python

2012-08-16 Thread andrea crotti
2012/8/16 Pervez Mulla mullaper...@gmail.com:

 Hey Steven ,

 Thank you for your response,

 I will in detail now about my project,

 Actually the project entire backend in PERL language , Am using Django 
 framework for my front end .

 I have written code for signup page in python , which is working perfectly .

 In HTml when user submit POST method, it calling Python code Instead of 
 this I wanna call perl script for sign up ..

 below in form for sign up page in python 

Good, that's finally an explanation. So the question you could have asked
Google is how do I call an external process from Python,
which has absolutely nothing to do with HTML, and is very easy to find
out (hint: subprocess).
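
i.e. something along these lines, called from the Django view that currently
handles the POST (the script path and arguments are made up):

import subprocess

def run_perl_signup(username, email):
    # hypothetical: delegate the signup to the existing perl script
    return subprocess.check_output(['perl', '/path/to/signup.pl', username, email])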
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sharing code between different projects?

2012-08-16 Thread andrea crotti
2012/8/16 andrea crotti andrea.crott...@gmail.com:


 Unfortunately I think you guess wrong
 http://forums.perforce.com/index.php?/topic/553-perforce-svnexternals-equivalent/
 Anyway with views and similar things is not that hard to implement the
 same thing..


I'm very happy to say that I finally made it!

It took 3 hours to move / merge a few thousand lines around but
everything seems to work perfectly now..

At the moment I'm just using symlinks, I'll see later if something
smarter is necessary, thanks to everyone for the ideas.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sharing code between different projects?

2012-08-15 Thread andrea crotti
2012/8/14 Cameron Simpson c...@zip.com.au:

 Having just skimmed this thread, one thing I haven't quite seen suggested is
 this:

 Really do make a third utilities project, and treat the project and
 deploy as separate notions. So to actually run/deploy project A's code
 you'd have a short script that copied project A and the utilities project
 code into a tree and ran off that. Or even a simple process/script to
 update the copy of utilities in project A's area.

 So you don't share code on an even handed basis but import the
 utilities library into each project as needed.

 I do this (one my own very small scale) in one of two ways:

   - as needed, copy the desired revision of utilities into the project's
 library space and do perforce's equivalent of Mercurial's addremove
 on that library tree (comment update utilities to revision X).

   - keep a perforce work area for the utilities in your project A area,
 where your working project A can hook into it with a symlink or some
 deploy/copy procedure as suggested above.
 With this latter one you can push back into the utilities library
 from your live project, because you have a real checkout. So:

   projectAdir
 projectA-perforce-checkout
 utilities-perforce-checkout
   projectBdir
 projectB-perforce-checkout
 utilities-perforce-checkout


Thanks, that's more or less what I was going to do..  But I would not use
symlinks and similar things, because then every user would have to set them
up accordingly.

Potentially we could instead use the perforce API to change the
workspace mappings at run-time, and thus force perforce to checkout
the files in the right place..

There is still the problem that people should checkout things from two
places all the time instead of one..

 Personally I become more and more resistent to cut/paste even for small
 things as soon as multiple people use it; you will never get to backport
 updates to even trivial code to all the copies.

 Cheers,


Well sure, but on the other hand as soon as multiple people use it you
can't change any of the public function signatures without being
afraid that you'll break something..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sharing code between different projects?

2012-08-15 Thread andrea crotti
Also looking at logilab-common I thought that it would be great if we
could actually make this common library even open source, and use it
as one of the other many external libraries.

Since Python code is definitively not the the core business of this
company I might even convince them, but the problem is that then all
the internal people working on it would not be able to use the
standard tools that they use with everything else..

Did anyone manage to convince his company to do something similar?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sharing code between different projects?

2012-08-14 Thread andrea crotti
2012/8/13 Rob Day robert@merton.oxon.org:
 I'd just create a module - called shared_utils.py or similar - and import
 that in both projects. It might be a bit messy if there's no 'unifying
 theme' to the module - but surely it'd be a lot less messy than your
 TempDirectory class, and anyone else who knows Python will understand
 'import shared_utils' much more easily.

 I realise you might not want to say, but if you could give some idea what
 sort of projects these are, and what sorts of code you're trying to share,
 it might make things a bit clearer.

 I'm not really sure what your concerns about 'versioning and how to link
 different pieces together' are - what d you think could go wrong here?


It's actually not so simple..

Because the two projects live in different parts of the repository
with different people allowed to work on them, and they have to run on
different machines..

On top of that I'm using perforce, which doesn't have any svn:externals-like
thing as far as I know..  What I should probably do is set up a
workspace (which contains *absolute* paths of the machines) with the
right settings to make the module available in the right position.

Second problem is that one of the two projects has a quite insane
requirement, which is to be able to re-run itself on a specific
version depending on a value fetched from the database.

This becomes harder if divide code around, but in theory I can use the
changeset number which is like a SVN revision so this should be fine.

The third problem is that, from the moment it's not just me using these
things, how can I be sure that changing something will not break
someone else's code?

I have unit tests on both projects plus the tests for the utils, but
as soon as I separate them it becomes harder to test everything..

So well everything can have a solution probably, I just hope it's
worth the effort..

Another thing which would be quite cool might be an import hook which
fetches things from the repository when needed, with a simple
bootstrap script for every project to be able to use this feature, but
it only makes sense if I need this kind of feature in many projects.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sharing code between different projects?

2012-08-14 Thread andrea crotti
2012/8/14 Jean-Michel Pichavant jeanmic...@sequans.com:

 I can think of logilab-common (http://www.logilab.org/848/)

 Having a company-wide python module properly distributed is one to achieve
 your goal. Without distributing your module to the public, there's a way to
 have a pypi-like server runnning on your private network :

 http://pypi.python.org/pypi/pypiserver/

 JM

 Note : looks like pypi.python.org is having some trouble, the above link is
 broken. Search for recent announcement about pypiserver.



Thanks, yes we need something like this..
I'll copy the name probably, I prefer common to utils/utilities..
-- 
http://mail.python.org/mailman/listinfo/python-list


Sharing code between different projects?

2012-08-13 Thread andrea crotti
I am in the situation where I am working on different projects that
might potentially share a lot of code.

I started to work on project A, then switched completely to project B
and in the transiction I copied over a lot of code with the
corresponding tests, and I started to modify it.

Now it's time to work again on project A, but I don't want to copy
things over again.

I would like to design a simple and nice way to share between projects,
where the things I want to share are simple but useful things as for
example:

from os import chdir, getcwd
from shutil import rmtree
from tempfile import mkdtemp
import logging

logger = logging.getLogger(__name__)


class TempDirectory:
    """Create a temporary directory and cd to it on enter, cd back to
    the original position and remove it on exit"""

    def __init__(self):
        self.oldcwd = getcwd()
        self.temp_dir = mkdtemp()

    def __enter__(self):
        logger.debug("create and move to temp directory %s" % self.temp_dir)
        # cd into the new temp directory, as described in the docstring
        chdir(self.temp_dir)
        return self.temp_dir

    def __exit__(self, type, value, traceback):
        # I first have to move out
        chdir(self.oldcwd)
        logger.debug("removing the temporary directory and going back to "
                     "the original position %s" % self.temp_dir)
        rmtree(self.temp_dir)


The problem is that there are functions/classes from many domains, so it
would not make much sense to create a real project, and the only name I
could give might be utils or utilities..

Plus, the moment the code is shared I must take care of versioning and
how to link different pieces together (we use perforce by the way).

If then someone else except me will want to use these functions then of
course I'll have to be extra careful, designing really good API's and so
on, so I'm wondering where I should set the trade-off between ability to
share and burden to maintain..

Anyone has suggestions/real world experiences about this?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CRC-checksum failed in gzip

2012-08-02 Thread andrea crotti
2012/8/1 Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:

 When you start using threads, you have to expect these sorts of
 intermittent bugs unless you are very careful.

 My guess is that you have a bug where two threads read from the same file
 at the same time. Since each read shares state (the position of the file
 pointer), you're going to get corruption. Because it depends on timing
 details of which threads do what at exactly which microsecond, the effect
 might as well be random.

 Example: suppose the file contains three blocks A B and C, and a
 checksum. Thread 8 starts reading the file, and gets block A and B. Then
 thread 2 starts reading it as well, and gets half of block C. Thread 8
 gets the rest of block C, calculates the checksum, and it doesn't match.

 I recommend that you run a file system check on the remote disk. If it
 passes, you can eliminate file system corruption. Also, run some network
 diagnostics, to eliminate corruption introduced in the network layer. But
 I expect that you won't find anything there, and the problem is a simple
 thread bug. Simple, but really, really hard to find.

 Good luck.

One last thing I would like to do before I add this fix is to actually
be able to reproduce this behaviour, and I thought I could just do the
following:

import gzip
import threading


class OpenAndRead(threading.Thread):
    def run(self):
        fz = gzip.open('out2.txt.gz')
        fz.read()
        fz.close()


if __name__ == '__main__':
    for i in range(100):
        OpenAndRead().start()


But no matter how many threads I start, I can't reproduce the CRC
error; any idea how I can help it happen?

The code in run should be shared by all the threads since there are no
locks, right?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CRC-checksum failed in gzip

2012-08-02 Thread andrea crotti
2012/8/2 Laszlo Nagy gand...@shopzeus.com:

 Your example did not share the file object between threads. Here an example
 that does that:

 class OpenAndRead(threading.Thread):
 def run(self):
 global fz
 fz.read(100)

 if __name__ == '__main__':

fz = gzip.open('out2.txt.gz')
for i in range(10):
 OpenAndRead().start()

 Try this with a huge file. And here is the one that should never throw CRC
 error, because the file object is protected by a lock:

 class OpenAndRead(threading.Thread):
 def run(self):
 global fz
 global fl
 with fl:
 fz.read(100)

 if __name__ == '__main__':

fz = gzip.open('out2.txt.gz')
fl = threading.Lock()
for i in range(2):
 OpenAndRead().start()



 The code in run should be shared by all the threads since there are no
 locks, right?

 The code is shared but the file object is not. In your example, a new file
 object is created, every time a thread is started.



Ok sure that makes sense, but then this explanation is maybe not right
anymore, because I'm quite sure that the file object is *not* shared
between threads, everything happens inside a thread..

I managed to get some errors doing this with a big file:

class OpenAndRead(threading.Thread):
    def run(self):
        global fz
        fz.read(100)

if __name__ == '__main__':
    fz = gzip.open('bigfile.avi.gz')
    for i in range(20):
        OpenAndRead().start()

and it doesn't fail without the *global*, but this is definitely not
what the code does, because every thread gets a new file object, it's
not shared..

Anyway we'll either read once for all the threads or add the lock, and
hopefully that should solve the problem, even if I'm not yet convinced
that this was the cause.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CRC-checksum failed in gzip

2012-08-02 Thread andrea crotti
2012/8/2 andrea crotti andrea.crott...@gmail.com:

 Ok sure that makes sense, but then this explanation is maybe not right
 anymore, because I'm quite sure that the file object is *not* shared
 between threads, everything happens inside a thread..

 I managed to get some errors doing this with a big file
 class OpenAndRead(threading.Thread):
  def run(self):
  global fz
  fz.read(100)

 if __name__ == '__main__':

 fz = gzip.open('bigfile.avi.gz')
 for i in range(20):
  OpenAndRead().start()

 and it doesn't fail without the *global*, but this is definitively not
 what the code does, because every thread gets a new file object, it's
 not shared..

 Anyway we'll read once for all the threads or add the lock, and
 hopefully it should solve the problem, even if I'm not convinced yet
 that it was this.


Just for completeness as suggested this also does not fail:

class OpenAndRead(threading.Thread):
    def __init__(self, lock):
        threading.Thread.__init__(self)
        self.lock = lock

    def run(self):
        global fz
        with self.lock:
            fz.read(100)

if __name__ == '__main__':
    lock = threading.Lock()
    fz = gzip.open('bigfile.avi.gz')
    for i in range(20):
        OpenAndRead(lock).start()
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pass data to a subprocess

2012-08-01 Thread andrea crotti
2012/8/1 Laszlo Nagy gand...@shopzeus.com:
 I was just surprised that it worked better than I expected even
 without Pipes and Queues, but now I understand why..

 Anyway now I would like to be able to detach subprocesses to avoid the
 nasty code reloading that I was talking about in another thread, but
 things get more tricky, because I can't use queues and pipes to
 communicate with a running process that it's noit my child, correct?

 Yes, I think that is correct. Instead of detaching a child process, you can
 create independent processes and use other frameworks for IPC. For example,
 Pyro.  It is not as effective as multiprocessing.Queue, but in return, you
 will have the option to run your service across multiple servers.

 The most effective IPC is usually through shared memory. But there is no OS
 independent standard Python module that can communicate over shared memory.
 Except multiprocessing of course, but AFAIK it can only be used to
 communicate between fork()-ed processes.


Thanks, there is another thing which is able to interact with running
processes in theory:
https://github.com/lmacken/pyrasite

I don't know though if it's a good idea to use a similar approach for
production code, as far as I understood it uses gdb..  In theory
though I could be able to set up every subprocess with all the data
they need, so I might not even need to share data between them.

Anyway now I had another idea to avoid to be able to stop the main
process without killing the subprocesses, using multiple forks.  Does
the following makes sense?  I don't really need these subprocesses to
be daemons since they should quit when done, but is there anything
that can go wrong with this approach?

from os import fork
from time import sleep
from itertools import count
from sys import exit

from multiprocessing import Process, Queue

class LongProcess(Process):
    def __init__(self, idx, queue):
        Process.__init__(self)
        # self.daemon = True
        self.queue = queue
        self.idx = idx

    def run(self):
        for i in count():
            self.queue.put("%d: %d " % (self.idx, i))
            print("adding %d: %d " % (self.idx, i))
            sleep(2)


if __name__ == '__main__':
    qu = Queue()

    # how do I do a multiple fork?
    for i in range(5):
        pid = fork()
        # if I create here all the data structures I should still be
        # able to do things
        if pid == 0:
            lp = LongProcess(1, qu)
            lp.start()
            lp.join()
            exit(0)
        else:
            print("started subprocess with pid ", pid)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pass data to a subprocess

2012-08-01 Thread andrea crotti
2012/8/1 Laszlo Nagy gand...@shopzeus.com:
 On thing is sure: os.fork() doesn't work under Microsoft Windows. Under
 Unix, I'm not sure if os.fork() can be mixed with
 multiprocessing.Process.start(). I could not find official documentation on
 that.  This must be tested on your actual platform. And don't forget to use
 Queue.get() in your test. :-)


Yes I know we don't care about Windows for this particular project..
I think mixing multiprocessing and fork should not harm, but probably
is unnecessary since I'm already in another process after the fork so
I can just make it run what I want.

Otherwise is there a way to do the same thing using only multiprocessing?
(running a process that is detachable from the process that created it)
-- 
http://mail.python.org/mailman/listinfo/python-list


CRC-checksum failed in gzip

2012-08-01 Thread andrea crotti
We're having some really obscure problems with gzip.
There is a program running with python2.7 on a 2.6.18-128.el5xen (red
hat I think) kernel.

Now this program does the following:
if filename == 'out2.txt':
    out2 = open('out2.txt')
elif filename == 'out2.txt.gz':
    out2 = open('out2.txt.gz')

text = out2.read()

out2.close()

very simple right? But sometimes we get a checksum error.
Reading the code I got the following:

 - CRC is at the end of the file and is computed against the whole
file (last 8 bytes)
 - after the CRC there is the \ marker for the EOF
 - readline() doesn't trigger the checksum generation in the
beginning, but only when the EOF is reached
 - until a file is flushed or closed you can't read the new content in it

but the problem is that we can't reproduce it: doing it manually on the
same files works perfectly, and the same files sometimes work and
sometimes don't.

The files are on a shared NFS drive, I'm starting to think that it's a
network/fs problem, which might truncate the file
adding an EOF before the end and thus making the checksum fail..
But is it possible?
Or what else could it be?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CRC-checksum failed in gzip

2012-08-01 Thread andrea crotti
2012/8/1 Laszlo Nagy gand...@shopzeus.com:
 On 2012-08-01 12:39, andrea crotti wrote:

 We're having some really obscure problems with gzip.
 There is a program running with python2.7 on a 2.6.18-128.el5xen (red
 hat I think) kernel.

 Now this program does the following:
 if filename == 'out2.txt':
   out2 = open('out2.txt')
 elif filename == 'out2.txt.gz'
   out2 = open('out2.txt.gz')

 Gzip file is binary. You should open it in binary mode.

 out2 = open('out2.txt.gz',b)

 Otherwise carriage return and newline characters will be converted
 (depending on the platform).


 --
 http://mail.python.org/mailman/listinfo/python-list


Ah no, sorry, I just wrote that part of the code wrong: it was
out2 = gzip.open('out2.txt.gz'), because otherwise nothing would possibly work..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CRC-checksum failed in gzip

2012-08-01 Thread andrea crotti
Full traceback:

Exception in thread Thread-8:
Traceback (most recent call last):
  File /user/sim/python/lib/python2.7/threading.py, line 530, in
__bootstrap_inner
self.run()
  File /user/sim/tests/llif/AutoTester/src/AutoTester2.py, line 67, in run
self.processJobData(jobData, logger)
  File /user/sim/tests/llif/AutoTester/src/AutoTester2.py, line 204,
in processJobData
self.run_simulator(area, jobData[1] ,log)
  File /user/sim/tests/llif/AutoTester/src/AutoTester2.py, line 142,
in run_simulator
report_file, percentage, body_text = SimResults.copy_test_batch(log, area)
  File /user/sim/tests/llif/AutoTester/src/SimResults.py, line 274,
in copy_test_batch
out2_lines = out2.read()
  File /user/sim/python/lib/python2.7/gzip.py, line 245, in read
self._read(readsize)
  File /user/sim/python/lib/python2.7/gzip.py, line 316, in _read
self._read_eof()
  File /user/sim/python/lib/python2.7/gzip.py, line 338, in _read_eof
hex(self.crc)))
IOError: CRC check failed 0x4f675fba != 0xa9e45aL


- The file is written with the linux gzip program.
- no, I can't reproduce the error with the exact same file that
failed, that's what is really puzzling;
  there seems to be no clear pattern and it just randomly fails. The file
is also only opened for reading by this program,
  so in theory there is no way it can be corrupted.

  I also checked with lsof if there are processes that opened it but
nothing appears..

- can't really try on the local disk, might take ages unfortunately
(we are rewriting this system from scratch anyway)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pass data to a subprocess

2012-08-01 Thread andrea crotti
2012/8/1 Roy Smith r...@panix.com:
 In article mailman.2809.1343809166.4697.python-l...@python.org,
  Laszlo Nagy gand...@shopzeus.com wrote:

 Yes, I think that is correct. Instead of detaching a child process, you
 can create independent processes and use other frameworks for IPC. For
 example, Pyro.  It is not as effective as multiprocessing.Queue, but in
 return, you will have the option to run your service across multiple
 servers.

 You might want to look at beanstalk (http://kr.github.com/beanstalkd/).
 We've been using it in production for the better part of two years.  At
 a 30,000 foot level, it's an implementation of queues over named pipes
 over TCP, but it takes care of a zillion little details for you.

 Setup is trivial, and there's clients for all sorts of languages.  For a
 Python client, go with beanstalkc (pybeanstalk appears to be
 abandonware).

 The most effective IPC is usually through shared memory. But there is no
 OS independent standard Python module that can communicate over shared
 memory.

 It's true that shared memory is faster than serializing objects over a
 TCP connection.  On the other hand, it's hard to imagine anything
 written in Python where you would notice the difference.
 --
 http://mail.python.org/mailman/listinfo/python-list


That does look nice and I would like to have something like that..
But since I have to convince my boss of another external dependency, I
think it might be worth trying out zeromq instead, which can also do
similar things and looks more powerful, what do you think?
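
For what it's worth, the pyzmq equivalent of a simple work queue is quite
small; a sketch (the port number and the message contents are arbitrary):

import zmq

# producer side (the parent/master process)
ctx = zmq.Context()
push = ctx.socket(zmq.PUSH)
push.bind("tcp://127.0.0.1:5557")
push.send_json({"job": 1, "files": ["a.txt", "b.txt"]})

# consumer side (run in the independent worker process)
# ctx = zmq.Context()
# pull = ctx.socket(zmq.PULL)
# pull.connect("tcp://127.0.0.1:5557")
# job = pull.recv_json()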
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CRC-checksum failed in gzip

2012-08-01 Thread andrea crotti
2012/8/1 Laszlo Nagy gand...@shopzeus.com:
there seems to be no clear pattern and just randmoly fails. The file
 is also just open for read from this program,
so in theory no way that it can be corrupted.

 Yes, there is. Gzip stores CRC for compressed *blocks*. So if the file is
 not flushed to the disk, then you can only read a fragment of the block, and
 that changes the CRC.


I also checked with lsof if there are processes that opened it but
 nothing appears..

 lsof doesn't work very well over nfs. You can have other processes on
 different computers (!) writting the file. lsof only lists the processes on
 the system it is executed on.


 - can't really try on the local disk, might take ages unfortunately
 (we are rewriting this system from scratch anyway)




Thanks a lot, someone writing to the file while it is being read might be
an explanation; the problem is that everyone claims that they are only
reading the file.

Apparently this file is generated once and only read a long time afterwards
by two different tools (in sequence), so this should not be possible
either, in theory.. I'll try to investigate more in this direction since
it's the only reasonable explanation I've found so far.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pass data to a subprocess

2012-08-01 Thread andrea crotti
2012/8/1 Laszlo Nagy gand...@shopzeus.com:

 So detaching the child process will not make IPC stop working. But exiting
 from the original parent process will. (And why else would you detach the
 child?)

 --
 http://mail.python.org/mailman/listinfo/python-list


Well, it makes perfect sense to me that it stops working, so either:
- I use zeromq or something similar to communicate, or
- I make every process independent, without the need to further
communicate with the parent..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CRC-checksum failed in gzip

2012-08-01 Thread andrea crotti
2012/8/1 Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:
 On Wed, 01 Aug 2012 14:01:45 +0100, andrea crotti wrote:

 Full traceback:

 Exception in thread Thread-8:

 DANGER DANGER DANGER WILL ROBINSON!!!

 Why didn't you say that there were threads involved? That puts a
 completely different perspective on the problem.

 I *was* going to write back and say that you probably had either file
 system corruption, or network errors. But now that I can see that you
 have threads, I will revise that and say that you probably have a bug in
 your thread handling code.

 I must say, Andrea, your initial post asking for help was EXTREMELY
 misleading. You over-simplified the problem to the point that it no
 longer has any connection to the reality of the code you are running.
 Please don't send us on wild goose chases after bugs in code that you
 aren't actually running.


   there seems to be no clear pattern and just randmoly fails.

 When you start using threads, you have to expect these sorts of
 intermittent bugs unless you are very careful.

 My guess is that you have a bug where two threads read from the same file
 at the same time. Since each read shares state (the position of the file
 pointer), you're going to get corruption. Because it depends on timing
 details of which threads do what at exactly which microsecond, the effect
 might as well be random.

 Example: suppose the file contains three blocks A B and C, and a
 checksum. Thread 8 starts reading the file, and gets block A and B. Then
 thread 2 starts reading it as well, and gets half of block C. Thread 8
 gets the rest of block C, calculates the checksum, and it doesn't match.

 I recommend that you run a file system check on the remote disk. If it
 passes, you can eliminate file system corruption. Also, run some network
 diagnostics, to eliminate corruption introduced in the network layer. But
 I expect that you won't find anything there, and the problem is a simple
 thread bug. Simple, but really, really hard to find.

 Good luck.


Thanks a lot, that makes a lot of sense..  I hadn't given this detail
before because I didn't write this code and I had completely forgotten
that there were threads involved; I'm just trying to help fix this bug.

Your explanation makes a lot of sense, but it's still surprising that
just reading files, without ever writing them, can cause trouble when
threads are involved :/
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CRC-checksum failed in gzip

2012-08-01 Thread andrea crotti
2012/8/1 Laszlo Nagy gand...@shopzeus.com:

 Thanks a lot, that makes a lot of sense..  I haven't given this detail
 before because I didn't write this code, and I forgot that there were
 threads involved completely, I'm just trying to help to fix this bug.

 Your explanation makes a lot of sense, but it's still surprising that
 even just reading files without ever writing them can cause troubles
 using threads :/

 Make sure that file objects are not shared between threads. If that is
 possible. It will probably solve the problem (if that is related to
 threads).


Well then I just have to create a lock, I guess? Something like:

with lock:
    # open the file
    # read the content
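
To make it concrete, a minimal (untested) sketch of what I mean,
assuming every reader thread goes through this helper so the lock
really serializes all access to the file:

import gzip
import threading

GZIP_LOCK = threading.Lock()     # shared by all reader threads


def read_gzip(path):
    # Serialize access so two threads never share a file position mid-read.
    with GZIP_LOCK:
        with gzip.open(path, 'rb') as gz:
            return gz.read()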
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pass data to a subprocess

2012-07-31 Thread andrea crotti


 def procs():
 mp = MyProcess()
 # with the join we are actually waiting for the end of the running time
 mp.add([1,2,3])
 mp.start()
 mp.add([2,3,4])
 mp.join()
 print(mp)


I think I got it now: if I call start() before another add(), then
inside Process.run it won't see the new data that was added after the
start.

So this way is perfectly safe only until the process is launched; once
it's running I need to use some multiprocess-aware data structure, is
that correct?
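
For example something like this rough sketch (the None sentinel is just
my assumption of how I'd tell the worker to finish) should let the
parent keep feeding data after start():

from multiprocessing import Process, Queue


def worker(queue):
    # Keep pulling work items until the parent sends the None sentinel.
    while True:
        item = queue.get()
        if item is None:
            break
        print('processing', item)


if __name__ == '__main__':
    queue = Queue()
    proc = Process(target=worker, args=(queue,))
    proc.start()
    queue.put([1, 2, 3])    # visible to the child even though sent after start()
    queue.put([2, 3, 4])
    queue.put(None)         # sentinel: tell the worker to stop
    proc.join()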
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pass data to a subprocess

2012-07-31 Thread andrea crotti
2012/7/31 Laszlo Nagy gand...@shopzeus.com:
 I think I got it now, if I already just mix the start before another add,
 inside the Process.run it won't see the new data that has been added after
 the start. So this way is perfectly safe only until the process is launched,
 if it's running I need to use some multiprocess-aware data structure, is
 that correct?

 Yes. Read this:

 http://docs.python.org/library/multiprocessing.html#exchanging-objects-between-processes

 You can use Queues and Pipes. Actually, these are basic elements of the
 multiprocessing module and they are well documented. I wonder if you read
 the documentation at all, before posting questions here.


 --
 http://mail.python.org/mailman/listinfo/python-list


As I wrote, I found many nice things (Pipe, Manager and so on), but
even the naive version seems to work: yes, I did read the documentation.

I was just surprised that it worked better than I expected even
without Pipes and Queues, but now I understand why..

Anyway, now I would like to be able to detach subprocesses, to avoid the
nasty code reloading that I was talking about in another thread, but
things get more tricky, because I can't use queues and pipes to
communicate with a running process that isn't my child, correct?
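
One thing that might work is multiprocessing.connection, which as far
as I understand doesn't need a parent/child relationship; a rough
sketch (the address and authkey are just placeholders):

from multiprocessing.connection import Listener, Client

ADDRESS = ('localhost', 6000)    # placeholder
AUTHKEY = 'secret'               # placeholder (must be bytes on Python 3)


def serve():
    # Runs inside the detached, long-running process.
    listener = Listener(ADDRESS, authkey=AUTHKEY)
    conn = listener.accept()
    while True:
        msg = conn.recv()        # any picklable object
        if msg == 'quit':
            break
        print('got', msg)
    listener.close()


def send(msg):
    # Runs in any other process that wants to talk to it.
    conn = Client(ADDRESS, authkey=AUTHKEY)
    conn.send(msg)
    conn.close()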
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: py2c - an open source Python to C/C++ is looking for developers

2012-07-30 Thread andrea crotti
2012/7/30  maniandra...@gmail.com:
 I created py2c ( http://code.google.com/p/py2c )- an open source Python to 
 C/C++ translator!
 py2c is looking for developers!
 To join create a posting in the py2c-discuss Google Group or email me!
 Thanks
 PS:I hope this is the appropiate group for this message.
 --
 http://mail.python.org/mailman/listinfo/python-list

It looks like a very, very hard task -- is it actually useful, or more
of an exercise?

In the first few lines I've seen there are the dangerous * imports, and
LazyStrin looks like a typo..

from ast import *
import functools
from c_types import *
from lazystring import *
#constant data
empty = LazyStrin
ordertuple = ((Or,),(And
-- 
http://mail.python.org/mailman/listinfo/python-list


regexps to objects

2012-07-27 Thread andrea crotti
I have some complex input to parse (with regexps), and I would like to
create nice objects directly from it.
The re module of course doesn't try to convert the matches to any type,
so I was playing around to see if it's worth doing something like the
code below, where I assign a constructor to every regexp and build an
object from the result..

Do you think it makes sense in general, or how do you cope with this
problem?

import re
from time import strptime

TIME_FORMAT_INPUT = '%m/%d/%Y %H:%M:%S'


def time_string_to_obj(timestring):
    return strptime(timestring, TIME_FORMAT_INPUT)


# regexp -> constructor used to turn the matched string into an object
REGEXPS = {
    'num': (r'\d+', int),
    'date': (r'[0-9/]+ [0-9:]+', time_string_to_obj),
}


def reg_to_obj(reg, st):
    reg, constr = reg
    found = re.match(reg, st)
    return constr(found.group())


if __name__ == '__main__':
    print reg_to_obj(REGEXPS['num'], '100')
    print reg_to_obj(REGEXPS['date'], '07/24/2012 06:23:13')
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: reloading code and multiprocessing

2012-07-27 Thread andrea crotti
2012/7/25 andrea crotti andrea.crott...@gmail.com:

 I would also like to avoid this in general, but we have many
 subprocesses to launch and some of them might take weeks, so we need
 to have a process which is always running, because there is never a
 point in time where we can just say let's stop everything and start again..

 Anyway if there are better solutions I'm still glad to hear them, but
 I would also like to keep it simple..

 Another thing which now we need to figure out is how to communicate
 with the live process..  For example we might want to submit something
 manually, which should pass from the main process.

 The first idea is to have a separate process that opens a socket and
 listens for data on a local port, with a defined protocol.

 Then the main process can parse these commands and run them.
 Are there easier ways otherwise?


So I was trying to do this: removing the module from sys.modules and
starting a new process (after modifying the file), but it doesn't work
as I expected.
The last assertion fails, but how?

The pyc file is not generated, the module is actually not in
sys.modules, and the function in the subprocess doesn't fail but still
returns the old value.
Any idea?

import sys
import unittest
from os import path
from multiprocessing import Process, Queue

from auto_tester.tests import a_glob

CUR_DIR = path.dirname(__file__)

old_a = "def ret(): return 0"
new_a = "def ret(): return 1"


def func_no_import(queue):
    queue.put(a_glob.ret())


class TestMultiProc(unittest.TestCase):

    def test_reloading_with_global_import(self):
        """In this case the import is done before the processes are started,
        so we need to clean sys.modules to make sure we reload everything"""
        queue = Queue()
        open(path.join(CUR_DIR, 'old_a.py'), 'w').write(old_a)

        p1 = Process(target=func_no_import, args=(queue, ))
        p1.start()
        p1.join()
        self.assertEqual(queue.get(), 0)

        open(path.join(CUR_DIR, 'old_a.py'), 'w').write(new_a)
        del sys.modules['auto_tester.tests.a_glob']

        p2 = Process(target=func_no_import, args=(queue, ))
        p2.start()
        p2.join()
        self.assertEqual(queue.get(), 1)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: reloading code and multiprocessing

2012-07-25 Thread andrea crotti
2012/7/23 Chris Angelico ros...@gmail.com:

 That would probably be correct. However, I still think you may be
 fighting against the language instead of playing to its strengths.

 I've never fiddled with sys.modules like that, but I know some have,
 without problem.

 ChrisA
 --
 http://mail.python.org/mailman/listinfo/python-list


I would also like to avoid this in general, but we have many
subprocesses to launch and some of them might take weeks, so we need
to have a process which is always running, because there is never a
point in time where we can just say let's stop everything and start again..

Anyway if there are better solutions I'm still glad to hear them, but
I would also like to keep it simple..

Another thing which we now need to figure out is how to communicate
with the live process..  For example we might want to submit something
manually, and that should go through the main process.

The first idea is to have a separate process that opens a socket and
listens for data on a local port, with a defined protocol.

Then the main process can parse these commands and run them.
Are there easier ways otherwise?
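
As a first rough sketch of the listener idea (the port and the
line-based protocol are just placeholders I made up):

import SocketServer              # 'socketserver' in Python 3


class CommandHandler(SocketServer.StreamRequestHandler):
    def handle(self):
        # One command per line; the main process would parse and dispatch it.
        for line in self.rfile:
            command = line.strip()
            self.wfile.write('ack: %s\n' % command)


if __name__ == '__main__':
    server = SocketServer.TCPServer(('127.0.0.1', 9999), CommandHandler)
    server.serve_forever()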
-- 
http://mail.python.org/mailman/listinfo/python-list


Dumping all the sql statements as backup

2012-07-25 Thread andrea crotti
I have some long-running processes that do very long simulations and
at the end need to write things to a database.

At the moment there are sometimes network problems and we end up with
half of the data in the database.

The half-data problem is probably solved easily with sessions and
sqlalchemy (a db-transaction), but still we would like to be able to
keep a backup SQL file in case something goes badly wrong and we want to
re-run it manually..

This might also be useful if we have to rollback the db for some reasons
to a previous day and we don't want to re-run the simulations..

Anyone did something similar?
It would be nice to do something like:

with CachedDatabase('backup.sql'):
# do all your things
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Dumping all the sql statements as backup

2012-07-25 Thread andrea crotti
2012/7/25 Jack tdl...@gmail.com

 Since you know the content of what the sql code is, why not just build
 the sql file(s) needed and store them so that in case of a burp you can
 just execute the code file. If you don't know the exact sql code, dump
 it to a file as the statements are constructed... The only problem you
 would run into in this scenario is duplicate data, which is also easily
 solvable by using transaction-level commits to the db.
 --
 http://mail.python.org/mailman/listinfo/python-list


Yes, but how do I construct them with SQLAlchemy?
One possible option I found is to enable the logging of some parts of
SQLAlchemy and use that log (echo=True in create_engine does something
similar), but maybe there is a better option..

I probably need to filter only the insert/update/delete statements
though..

And in general the processes have to run independently, so in case of
database connection problems I would just let them retry until it
actually works.

When the transaction actually works, I can then add a marker to the
backed-up log (or archive it) to avoid replaying it.
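
For example, something along these lines might do it (a rough sketch
using the SQLAlchemy event API; the engine URL is a placeholder and the
parameters are written out as a comment instead of being inlined into
real SQL):

from sqlalchemy import create_engine, event

engine = create_engine('postgresql://user:password@host/dbname')   # placeholder URL
backup = open('backup.sql', 'a')


@event.listens_for(engine, 'before_cursor_execute')
def dump_statement(conn, cursor, statement, parameters, context, executemany):
    # Keep only the statements that modify data.
    verb = statement.lstrip().split(None, 1)[0].upper()
    if verb in ('INSERT', 'UPDATE', 'DELETE'):
        backup.write('%s;\n-- params: %r\n' % (statement, parameters))
        backup.flush()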
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: reloading code and multiprocessing

2012-07-23 Thread andrea crotti
2012/7/20 Chris Angelico ros...@gmail.com:
 On Thu, Jul 19, 2012 at 8:15 PM, andrea crotti
 andrea.crott...@gmail.com wrote:
 We need to be able to reload code on a live system.  This live system
 has a daemon process always running but it runs many subprocesses with
 multiprocessing, and the subprocesses might have a short life...
 ...
 As long as I import the code in the function and make sure to remove the
 pyc files everything seems to work..
 Are there any possible problems which I'm not seeing in this approach or
 it's safe?

 Python never promises reloading reliability, but from my understanding
 of what you've done here, it's probably safe. However, you may find
 that you're using the wrong language for the job; it depends on how
 expensive it is to spin off all those processes and ship their work to
 them. But if that's not an issue, I'd say you have something safe
 there. (Caveat: I've given this only a fairly cursory examination, and
 I'm not an expert. Others may have more to say. I just didn't want the
 resident Markov chainer to be the only one to respond!!)

 ChrisA
 --
 http://mail.python.org/mailman/listinfo/python-list


Thanks Chris, always nice to get a human answer ;)

Anyway, the only other problem I found is that if I start the
subprocesses after many other things have been initialised, it might
happen that the reloading doesn't work correctly, is that right?

Because sys.modules gets inherited by the subprocesses, and it will not
reimport what has already been imported, as far as I understood..

So either I make sure I import everything only where it is needed, or
(maybe better and more explicit) I manually remove from sys.modules all
the modules that I want to reload, what do you think?
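
Something like this small helper is what I have in mind (untested, and
it assumes nothing else keeps references to the old module objects):

import sys


def purge_modules(prefix):
    # Drop every cached module under `prefix` so the next import in a
    # freshly started subprocess re-reads the source files.
    for name in list(sys.modules):
        if name == prefix or name.startswith(prefix + '.'):
            del sys.modules[name]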
-- 
http://mail.python.org/mailman/listinfo/python-list


reloading code and multiprocessing

2012-07-19 Thread andrea crotti
We need to be able to reload code on a live system.  This live system
has a daemon process always running but it runs many subprocesses with
multiprocessing, and the subprocesses might have a short life...

Now I found a way to reload the code successfully, as you can see from
this testcase:


import unittest
from os import path, remove
from multiprocessing import Process

cur_dir = path.dirname(__file__)
old_a = "def ret(): return 0"
new_a = "def ret(): return 1"


def func():
    from . import a
    print(a.ret())


class TestMultiProc(unittest.TestCase):
    def setUp(self):
        open(path.join(cur_dir, 'a.py'), 'w').write(old_a)

    def tearDown(self):
        remove(path.join(cur_dir, 'a.py'))

    def test_reloading(self):
        """Starting a new process gives a different result"""
        p1 = Process(target=func)
        p2 = Process(target=func)
        p1.start()
        res = p1.join()
        open(path.join(cur_dir, 'a.py'), 'w').write(new_a)
        remove(path.join(cur_dir, 'a.pyc'))

        p2.start()
        res = p2.join()


As long as I import the code inside the function and make sure to
remove the pyc files, everything seems to work..
Are there any possible problems with this approach that I'm not seeing,
or is it safe?

Any other better ways otherwise?
-- 
http://mail.python.org/mailman/listinfo/python-list


  1   2   3   4   >