Re: [Web-SIG] Extracting web data
BeautifulSoup is the standard response. I think lxml will not work very well unless the HTML is extremely nicely formatted, but I could be wrong.

For what you describe I would suggest developing seat-of-the-pants heuristics -- just get the page using httplib and then use string.find liberally. I've had at least three consulting gigs solving these problems using various techniques, and the general problem is quite difficult. But if you are trying to parse just a few pages in simple ways, developing special-purpose heuristics is pretty easy (until they redesign the pages, which they will do every so often).

Best of luck,
-- Aaron Watters

btw: If you have lots of money to spend on this, my former client connotate.com does this sort of scraping (and I developed some of the code).

--- On Mon, 2/21/11, James Mills wrote:
> From: James Mills
> Subject: Re: [Web-SIG] Extracting web data
> To: "web-sig"
> Date: Monday, February 21, 2011, 7:07 PM
>
> On Mon, Feb 21, 2011 at 2:21 PM, Deb Midya wrote:
> > Hi Python web-sig users,
> >
> > Thanks in advance and I am new to web-sig. I am using Python 2.6 on Windows XP. May I request you to assist me with the following, please.
> >
> > I would like to extract web data from a site (http://finance.yahoo.com, for example). The data may include Historical Prices, Key Statistics, News & Info, Headlines, etc. for a list of codes (such as WOW; these are codes for company IDs).
> >
> > I am trying to automate the extraction of data. Is there any Python module or any assistance, please?
> >
> > Once again, thank you very much for the time you have given.
>
> You might want to look into using either the lxml or BeautifulSoup modules.
>
> cheers
> James
>
> -- James Mills
> "Problems are solved by method"

___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
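The "seat-of-the-pants" approach suggested above can be sketched roughly as follows. This is only an illustration, written for Python 3 rather than the Python 2.6 mentioned in the thread; the helper names (`extract_between`, `scrape_values`) and the sample page fragment are made up for the example, and a real page would be fetched over HTTP first (httplib in Python 2, `http.client` or `urllib.request` in Python 3).

```python
def extract_between(page, start_marker, end_marker, begin=0):
    """Return (text, next_index) for the text between two markers.

    Returns (None, -1) when either marker is missing after `begin`.
    """
    start = page.find(start_marker, begin)
    if start == -1:
        return None, -1
    start += len(start_marker)
    end = page.find(end_marker, start)
    if end == -1:
        return None, -1
    return page[start:end].strip(), end + len(end_marker)


# Hypothetical page fragment; the markup is an assumption, not Yahoo's.
SAMPLE_PAGE = """
<table>
<tr><td class="label">Last Trade:</td><td class="value"> 27.15 </td></tr>
<tr><td class="label">Volume:</td><td class="value"> 1,204,300 </td></tr>
</table>
"""


def scrape_values(page):
    """Collect every cell marked class="value" using only str.find."""
    values = []
    position = 0
    while True:
        text, position = extract_between(
            page, '<td class="value">', "</td>", position)
        if text is None:
            break
        values.append(text)
    return values
```

As the message warns, markers like these are tied to one page layout and will need adjusting whenever the site is redesigned.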
Re: [Web-SIG] WSGI for Python 3
I'm still in denial regarding Python 3 generally speaking, but it looks like something important is going on here. Could someone summarize the main points (intelligible to a Python 2 troglodyte)?

thanks in advance,
-- Aaron Watters

===
% man less
less is more.
Re: [Web-SIG] WSGI and start_response
Someone remind me: where is the canonical WSGI 2 spec? I assume there is a way to "wrap" WSGI 1 applications without breaking them? Or is this the regex-->re fiasco all over again?

-- Aaron Watters

--- On Thu, 4/8/10, Manlio Perillo wrote:
> From: Manlio Perillo
> Subject: [Web-SIG] WSGI and start_response
> To: "'Web SIG'"
> Date: Thursday, April 8, 2010, 10:08 AM
>
> Hi.
>
> Some time ago I objected to the decision to remove the start_response function from the next version of WSGI, using as rationale the fact that without the start_response callable, asynchronous extensions are impossible to support.
>
> Now I have found that removing start_response will also make it impossible to support coroutines (or, at least, some coroutine usage).
>
> Here is an example (this is the same example I posted a few days ago):
> http://paste.pocoo.org/show/199202/
>
> Forgetting about the write callable, the problem is that the application starts to yield data when the tmpl.render_unicode function is called.
>
> Please note that this has *nothing* to do with asynchronous applications. The code should work with *all* WSGI implementations.
>
> In the pasted example, the Mako render_unicode function is "turned" into a generator, with a simple function that allows flushing the current buffer.
>
> Can someone else confirm that this code is impossible to support in WSGI 2.0?
>
> If my suspicion is true, I once again object to removing start_response.
>
> WSGI 1.0 is really a well designed protocol, since it is able to support both asynchronous applications (with a custom extension) and coroutines, *even* if this was not considered during protocol design.
>
> Thanks
> Manlio
Re: [Web-SIG] SQLAlchemy Queries & HTML Data Grid
Randy:

It seems you want a sortable HTML table that talks to a back end query engine. I don't see why this needs to be specific to SQLAlchemy.

Here is a WHIFF middleware which does some of what you are talking about (the demo formatting is basic/ugly for simplicity purposes).

Demo: http://whiffdoc.appspot.com/tests/misc/testSortTable
Demo source: http://whiffdoc.appspot.com/tests/showText?path=./misc/testSortTable

The documentation http://whiffdoc.appspot.com/docs/W1200_1400.stdMiddleware#Header83 is not extensive, but here is the source for the core middleware widget:
http://aaron.oirt.rutgers.edu/cgi-bin/whiffRepo.cgi/file/8c031c68a5a0/whiff/middleware/sortTable.py

As written it requires the whole table as a list of dictionaries and then does paging from the full list. It certainly needs generalization but maybe it's a start.

Let me know if you have questions or comments.
-- Aaron Watters

--- On Wed, 4/7/10, Randy Syring wrote:
> From: Randy Syring
> Subject: Re: [Web-SIG] SQLAlchemy Queries & HTML Data Grid
> To: "Aaron Watters"
> Cc: web-sig@python.org
> Date: Wednesday, April 7, 2010, 1:37 PM
>
> Aaron,
>
> Sorry, I must not have explained clearly. This isn't an abstraction layer, but more like a UI component or widget that facilitates basic reporting. Look at these pages:
>
> http://www.redmine.org/issues
> http://trac.edgewall.org/query
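The "whole table as a list of dictionaries, then page from the full list" approach described above might look something like this. It is a sketch of the general idea only, not the WHIFF sortTable source; the function name, parameters, and sample rows are all assumptions made for illustration.

```python
def sort_and_page(rows, sort_key, page=0, page_size=10, descending=False):
    """Sort a list of dictionaries on one column and return one page of it.

    The full list is sorted on every call, which is simple but only
    sensible when the whole table fits comfortably in memory.
    """
    ordered = sorted(rows, key=lambda row: row[sort_key], reverse=descending)
    start = page * page_size
    return ordered[start:start + page_size]


# Hypothetical data standing in for a query result.
ISSUES = [
    {"id": 3, "status": "open", "subject": "Crash on save"},
    {"id": 1, "status": "closed", "subject": "Typo in docs"},
    {"id": 2, "status": "open", "subject": "Add paging"},
]

first_page = sort_and_page(ISSUES, "id", page=0, page_size=2)
second_page = sort_and_page(ISSUES, "id", page=1, page_size=2)
```

Pushing the sort and the page window down into the back end query would be the obvious generalization for large tables.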
Re: [Web-SIG] IIS and Python CGI - how do I see more than just the form data?
--- On Tue, 4/6/10, J.D. Main wrote:
> From: J.D. Main
> Subject: Re: [Web-SIG] IIS and Python CGI - how do I see more than just the form data?
> To: web-sig@python.org
> Date: Tuesday, April 6, 2010, 9:25 PM
>
> Thanks Aaron,
>
> I think I will explore the WSGI interface. However, I did learn a trick using the OS Module:
>
> import cgi, os
>
> formfields = cgi.FieldStorage()
> http_stuff = os.environ

Yes, that will work too. In fact the CGI interface to WSGI works like this. The advantage to using WSGI is that it makes it possible to move your application to other configurations more easily (in theory) and it's just a tiny bit more high level.

Best regards,
-- Aaron Watters
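For readers on current Python, here is a rough sketch of the same os.environ trick. It assumes the web server populates the standard CGI variables (REQUEST_METHOD, SCRIPT_NAME, PATH_INFO, QUERY_STRING), and it uses `urllib.parse.parse_qs` instead of `cgi.FieldStorage` because the `cgi` module was removed in Python 3.13. The function name `describe_request` is made up for the example.

```python
import os
from urllib.parse import parse_qs


def describe_request(environ=None):
    """Summarize the request from CGI-style environment variables.

    Pass a plain dictionary for testing; by default the process
    environment is used, which is where a CGI server puts these values.
    """
    environ = os.environ if environ is None else environ
    return {
        "method": environ.get("REQUEST_METHOD", "GET"),
        "script": environ.get("SCRIPT_NAME", ""),
        "path": environ.get("PATH_INFO", ""),
        "query": parse_qs(environ.get("QUERY_STRING", "")),
    }


# Hypothetical values a server might set for
# http://someserver/myscript.py/someuser/someupdate?var1=Charlie
EXAMPLE_ENVIRON = {
    "REQUEST_METHOD": "GET",
    "SCRIPT_NAME": "/myscript.py",
    "PATH_INFO": "/someuser/someupdate",
    "QUERY_STRING": "var1=Charlie",
}
info = describe_request(EXAMPLE_ENVIRON)
```

Whether IIS fills in PATH_INFO the way other servers do is not something this sketch can confirm; dumping the environment on the actual server is the quickest check.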
Re: [Web-SIG] SQLAlchemy Queries & HTML Data Grid
Thanks Randy, very interesting.

My initial reaction is that you are building a stack on top of a stack. It's not clear to me what problem you want to solve and what your requirements are. It's possible that you could find it easier to abstract directly on top of SQL, or alternatively you could consider using another sort of data model like mongodb. Building an abstraction on top of SQLAlchemy, which hasn't even reached 1.0, strikes me as dubious.

Thanks again,
-- Aaron Watters

--- On Tue, 4/6/10, Randy Syring wrote:
> From: Randy Syring
> Subject: [Web-SIG] SQLAlchemy Queries & HTML Data Grid
> To: web-sig@python.org
> Date: Tuesday, April 6, 2010, 4:37 PM
>
> I am planning on building a library that will facilitate creation of custom queries and HTML display of resulting datasets from SQLAlchemy queries. I have some basic work done here:
>
> https://svn.rcslocal.com:8443/svn/pysmvt/pysapp/branches/0.1/pysapp/modules/datagrid/
>
> But I don't like the API and I don't want the library to be dependent on pysapp. Furthermore, I would like to have a more verbose querying ability akin to Redmine:
>
> http://www.redmine.org/projects/redmine/issues
>
> Including:
>
> Filters
> Column Selection
> Grouping (multiple levels)
> Sorting (multiple columns)
> some kind of query saving/loading mechanism with a flexible backend
>
> I have done some basic table generation work here:
>
> https://svn.rcslocal.com:8443/svn/pysmvt/pysdatagrid/trunk/
>
> with the tests being the best place to get an idea of how it works:
>
> https://svn.rcslocal.com:8443/svn/pysmvt/pysdatagrid/trunk/pysdatagrid/tests/test_render.py
>
> Looking for comments, pointers to other projects, and/or possibly interest in helping with a project like this. I am currently working in SVN but will most likely move to hg/git if there are others who are interested.
>
> --
> Randy Syring
> Intelicom
> 502-644-4776
>
> "Whether, then, you eat or drink or whatever you do, do all to the glory of God." 1 Cor 10:31
Re: [Web-SIG] IIS and Python CGI - how do I see more than just the form data?
I think you should consider using the WSGI interface. The WSGI interface puts all the components of a request into a request environment dictionary which is sent as a parameter to the function generating the response.

For example have a look at the test application

http://whiffdoc.appspot.com/tests/misc/testDebugDump?thisVar=thatValue&thisOtherVar=ThatOtherValue

which dumps out the WSGI environment (with WHIFF extensions) to the response. All the information you need is somewhere inside the environment dictionary (but it's not always easy to find).

You could also look at WHIFF, which helps combine some of the features of the CGI module with the WSGI interface.

http://whiffdoc.appspot.com/

Hope that helps,
-- Aaron Watters

===
% man less
less is more.

--- On Sat, 4/3/10, J.D. Main wrote:
> From: J.D. Main
> Subject: [Web-SIG] IIS and Python CGI - how do I see more than just the form data?
> To: web-sig@python.org
> Date: Saturday, April 3, 2010, 12:32 PM
>
> Hi Folks,
>
> I hope this question hasn't already been answered...
>
> I'm using IIS 5 and calling a python script directly in the URL of a request. Something like:
>
> http://someserver/myscript.py
>
> or even
>
> http://someserver/myscript.py?var1=something&var2=somthingelse
>
> Using the CGI module, I can certainly see and act upon the variables that are passed as GET or POST actions. What I'm after is something more low level. I want to see the entire HTTP request with everything inside it.
>
> Does IIS actually pass that information to the CGI application or does it just pass the variables?
>
> The intent is to write a "RESTFUL" CGI script. I need to actually "see" the URI and the parameters of the incoming request to map the appropriate action. Without short circuiting the IIS webserver, how would my python parse the following:
>
> http://someserver/someapp/someuser/someupdate?var1=Charlie
>
> Thanks in advance!
>
> JDM
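To make the "request environment dictionary" concrete, here is a minimal WSGI application that dumps its environ as plain text, in the spirit of the testDebugDump demo linked above. It is not the WHIFF implementation; `dump_environ_app` and the small `call_app` test harness are illustrative names, and only the standard library's `wsgiref` is used.

```python
from wsgiref.util import setup_testing_defaults


def dump_environ_app(environ, start_response):
    """WSGI application that writes every environ entry to the response."""
    lines = ["%s = %r" % (key, environ[key]) for key in sorted(environ)]
    body = "\n".join(lines).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/", query=""):
    """Call a WSGI app in-process with a fake request (no server needed)."""
    environ = {"PATH_INFO": path, "QUERY_STRING": query}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers
        return lambda data: None  # write() callable, unused here

    body = b"".join(app(environ, start_response))
    return captured["status"], body.decode("utf-8")


status, text = call_app(
    dump_environ_app, "/someapp/someuser/someupdate", "var1=Charlie")
```

For a REST-style dispatcher, PATH_INFO, QUERY_STRING and REQUEST_METHOD are usually the entries to look at first.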
Re: [Web-SIG] wsgi.errors and close method
...
> It is all very silly. Technically a file like object is not required to have a 'closed' attribute, so that code expecting it was wrong in the first place.
> All you can really do is supply as many of the methods, attributes as possible, all required and as many as optional as makes sense, because you can't trust people to read the documentation properly.

I don't think it's silly. This is why static typing is useful. I'm involved in migrating a large amount of Java code to a new base right now and I have to say that it's pretty nice to know that if it compiles it will most likely work.

playing devil's advocate
-- Aaron Watters

===
% ping elvis
elvis is alive
Re: [Web-SIG] Generic configuration
WHIFF has a concept of configurable resources for application groups, if you care to take a look.

concept: http://whiffdoc.appspot.com/docs/W1000_1000.resources
deployment API: http://whiffdoc.appspot.com/docs/W1200_1000.DirectoryConfig
access API: http://whiffdoc.appspot.com/docs/W1200_1300.applicationAPI#Header8
example usage: http://whiffdoc.appspot.com/docs/W1100_1200.wwiki#Header7

I'm currently extending this paradigm to allow resource allocation with access control support. That should be in the next release.

-- Aaron Watters

--- On Tue, 3/16/10, Alex Morega wrote:
> From: Alex Morega
> Subject: [Web-SIG] Generic configuration
> To: web-sig@python.org
> Date: Tuesday, March 16, 2010, 11:35 AM
>
> Hello,
>
> This is not really a WSGI question, it's more into general configuration, but I don't know of a better place to ask it.
>
> Paster config files allow you to hook up WSGI applications, middleware, and a server, plus some (undocumented?) magic configuration of the logging module. But what about random components, like a database? Ideally I'd like to specify a factory for database connections and give it some parameters; this would return a reference to a new database connection. I could then pass this reference to my wsgi app or middleware.
>
> Apparently the pattern is to perform this database configuration as part of a wsgi middleware, but that feels unnatural. Or one could do this outside of the paste configuration file, but that just splits the configuration needlessly into several pieces. Am I missing something obvious?
>
> Thanks,
> -- Alex
Re: [Web-SIG] wsgiorg.routing_path addition to the wsgiorg.routing_args Specification
I had to implement something like this for WHIFF. I think path dispatch considerations do not belong at the level of the WSGI spec. Higher level layers should worry about exactly how the URL gets dispatched within the application. The higher layers can add environment entries as needed, like "whiff.entry_point" and "whiff.template_path" etc.

Or maybe I misunderstand something.
-- Aaron Watters

--- On Wed, 1/6/10, Gustavo Narea wrote:
> From: Gustavo Narea
> Subject: Re: [Web-SIG] wsgiorg.routing_path addition to the wsgiorg.routing_args Specification
> To: web-sig@python.org
> Date: Wednesday, January 6, 2010, 5:42 AM
>
> Is it a really bad suggestion? :(
>
> - G.
>
> On Mon, Jan 4, 2010 at 11:31 PM, Gustavo Narea wrote:
> > Hello everybody.
> >
> > The current wsgiorg.routing_args specification requires that "Portions of the path that have been parsed should still be moved to SCRIPT_NAME (and removed from PATH_INFO)", but:
> >
> > 1.- That's against semantics. According to PEP 333 and the CGI spec, SCRIPT_NAME and PATH_INFO must represent the path where the (WSGI) application is "mounted" and the location of the request's target, respectively.
> > 2.- It's not possible to reconstruct URLs reliably. After these variables have been modified, any attempt to reconstruct the home page's URL will be erroneous, for example.
> > 3.- PATH_INFO will end up useless in many requests. For example, if a request matches the pattern "/posts/{article_title}/", these variables would have the following values:
> >     SCRIPT_NAME = "/blog/posts/hello-world"
> >     PATH_INFO = "/"
> >
> > I understand the reasoning behind a "cleaner" path, but I think taking data out of the PATH_INFO is not the best approach. Even if we only remove the matches alone, retaining the characters in between (instead of taking everything up to the last position of the match), we'd only be solving the third problem.
> >
> > So I'd like to propose the introduction of a new variable in the WSGI environment, wsgiorg.routing_path, which would be the PATH_INFO with all the arguments removed.
> >
> > Dispatchers would not have to modify SCRIPT_NAME or PATH_INFO. Instead, they should:
> > 1.- Take the arguments from PATH_INFO and put them into wsgiorg.routing_args (as they do now).
> > 2.- Store the PATH_INFO without arguments in wsgiorg.routing_path.
> >
> > Example 1
> > ---------
> > Pattern = "/posts/{article_title}/"
> > PATH_INFO = "/posts/hello-world/"
> > wsgiorg.routing_args = ((), {'article_title': "hello-world"})
> > wsgiorg.routing_path = "/posts/"
> >
> > Example 2
> > ---------
> > Pattern = "/posts/{article_title}/edit"
> > PATH_INFO = "/posts/hello-world/edit"
> > wsgiorg.routing_args = ((), {'article_title': "hello-world"})
> > wsgiorg.routing_path = "/posts/edit"
> >
> > This information would be useful in a number of situations, such as:
> >
> > 1.- An authorization framework could allow developers to write access controls based on the arguments-free path (i.e., wsgiorg.routing_path) and then use the arguments (in wsgiorg.routing_args) for more specific controls (if any).
> > 2.- Templates can change automatically depending on the arguments-free path.
> >
> > .. which are not possible at present.
> >
> > What do you think about this?
> >
> > Cheers.
> > --
> > Gustavo Narea.
> > | Tech blog: =Gustavo/(+blog)/tech ~ About me: =Gustavo/about |
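A dispatcher following the quoted proposal could be sketched like this. wsgiorg.routing_path was only a suggestion in this thread, so the code below illustrates the two examples above rather than any adopted specification; the `{name}` pattern syntax, the helper names, and the choice to drop whole placeholder segments are assumptions.

```python
import re

# A placeholder is a path segment of the form {name}.
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def compile_pattern(pattern):
    """Turn "/posts/{article_title}/edit" into a compiled regular expression."""
    regex = ""
    position = 0
    for match in PLACEHOLDER.finditer(pattern):
        regex += re.escape(pattern[position:match.start()])
        regex += "(?P<%s>[^/]+)" % match.group(1)
        position = match.end()
    regex += re.escape(pattern[position:])
    return re.compile("^" + regex + "$")


def route(pattern, environ):
    """Fill in routing entries if PATH_INFO matches; leave PATH_INFO alone."""
    match = compile_pattern(pattern).match(environ.get("PATH_INFO", ""))
    if match is None:
        return False
    # Positional arguments are unused here, hence the empty tuple.
    environ["wsgiorg.routing_args"] = ((), match.groupdict())
    # The arguments-free path: the pattern with placeholder segments dropped.
    segments = [s for s in pattern.split("/") if not PLACEHOLDER.fullmatch(s)]
    environ["wsgiorg.routing_path"] = "/".join(segments)
    return True


example = {"SCRIPT_NAME": "/blog", "PATH_INFO": "/posts/hello-world/edit"}
matched = route("/posts/{article_title}/edit", example)
```

Because SCRIPT_NAME and PATH_INFO are never modified, URL reconstruction keeps working, which is the point of objections 1 and 2 in the quoted message.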
Re: [Web-SIG] wsgi write=start_response() and iterable return?
--- On Mon, 1/4/10, P.J. Eby wrote:
> From: P.J. Eby
> Subject: Re: [Web-SIG] wsgi write=start_response() and iterable return?
> To: "Aaron Watters", web-sig@python.org
> Date: Monday, January 4, 2010, 4:38 PM
>
> At 08:42 AM 1/4/2010 -0800, Aaron Watters wrote:
> > > From: Aaron Watters
> > >
> > > If an application returns an iterable response and *also* calls the write()... what is supposed to happen?
> >
> > After carefully considering all the responses on this issue ;c) I came up with the following strategy for dealing with calls to write() in combination with an iterable response: see
> >
> > http://listtree.appspot.com/listtreeNotes/qFxCJOYB2xkf2vyQS5L$AA
> >
> > This wrapper implementation diverts calls to write() into the iterable response so the rest of the system can ignore the write() function(). I'd be very happy if some of you would take a quick look and see if this makes sense to you.
>
> Do note that an application which calls write() from an iterator body is *not* WSGI compliant, as described under:
>
> In practice, however, wsgiref.handlers treats write() and yield as interchangeable, and wsgiref.validate doesn't complain if an application calls write() from within an iteration.. :-(

... And I'm not sure that a complicated program might not do this, even if it was not intended, if it uses both idioms. In fact, since WHIFF is designed to combine external components, you probably can confuse WHIFF (or other infrastructure tools) into mixing the modes even if the tool doesn't mix modes directly.

I'd like to see the write() callable removed from future versions of WSGI, and wrappers like the one I referenced could provide backwards compatibility for old style apps.

-- Aaron Watters

===
less is more
Re: [Web-SIG] wsgi write=start_response() and iterable return?
> From: Aaron Watters
>
> If an application returns an iterable response and *also* calls the write()... what is supposed to happen?

After carefully considering all the responses on this issue ;c) I came up with the following strategy for dealing with calls to write() in combination with an iterable response: see

http://listtree.appspot.com/listtreeNotes/qFxCJOYB2xkf2vyQS5L$AA

This wrapper implementation diverts calls to write() into the iterable response so the rest of the system can ignore the write() function(). I'd be very happy if some of you would take a quick look and see if this makes sense to you.

Thanks in advance,
-- Aaron Watters

===
less is more
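The general shape of such a wrapper can be sketched as below. This is not the code behind the link above, just one possible reading of "divert calls to write() into the iterable response"; `divert_write` and `mixed_app` are hypothetical names, and the sketch deliberately tolerates write() calls made during iteration even though that usage is not WSGI compliant.

```python
def divert_write(app):
    """Middleware: data passed to write() is queued and emitted via the iterable."""
    def wrapped(environ, start_response):
        pending = []

        def fake_start_response(status, headers, exc_info=None):
            # Pass status and headers through, but ignore the server's write().
            if exc_info is not None:
                start_response(status, headers, exc_info)
            else:
                start_response(status, headers)
            return pending.append  # our write() just queues the data

        result = app(environ, fake_start_response)

        def body():
            try:
                for chunk in result:
                    # Anything written before this chunk was produced goes first.
                    while pending:
                        yield pending.pop(0)
                    yield chunk
                while pending:
                    yield pending.pop(0)
            finally:
                close = getattr(result, "close", None)
                if close is not None:
                    close()

        return body()
    return wrapped


def mixed_app(environ, start_response):
    """Old-style app that mixes write() calls with an iterable response."""
    write = start_response("200 OK", [("Content-Type", "text/plain")])
    write(b"written first, ")

    def chunks():
        yield b"yielded second, "
        write(b"written third, ")
        yield b"yielded fourth"

    return chunks()


def collect(app):
    """Run an app with a do-nothing start_response and join its output."""
    def start_response(status, headers, exc_info=None):
        return lambda data: None
    return b"".join(app({}, start_response))
```

Everything downstream of the wrapper then sees a single ordered stream of chunks and never has to think about write().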
[Web-SIG] wsgi write=start_response() and iterable return?
> S4. An I/O error (such as Connection reset by peer or Broken pipe) occurs when writing to the write() callable returned by start_response().

Interesting. I had totally missed the write() callable return value required by start_response.

If an application returns an iterable response and *also* calls the write()... what is supposed to happen?

Yikes. This may require some careful adjustments to WHIFF. I had run into this and hacked around it on an ad hoc basis assuming it was a mistake.

-- Aaron Watters

===
less is more
Re: [Web-SIG] CGI WSGI and Unicode
--- On Mon, 12/7/09, Graham Dumpleton wrote:
> For the record, CGI/WSGI adapters should also protect the original stdin/stdout so WSGI application doesn't cause problems by using 'print' or do other odd stuff with input. I haven't seen a single CGI/WSGI adapter which does it in a way that I would say is correct, or at least robust against users doing stupid things...

"There is no fool proof software: fools are too clever"

"Doctor, it hurts when I do this."
"Don't do that."

Some words of wisdom from folklore... (or if anyone knows the correct attribution, please inform).

-- Aaron Watters
http://listtree.appspot.com
http://whiffdoc.appspot.com

===
an apple every 8 hours will keep 3 doctors away. -- kliban
Re: [Web-SIG] http://wiki.python.org/moin/WebFrameworks
> > On Thu, Nov 26, 2009 at 1:02 PM, Chris McDonough wrote:
> > http://wiki.python.org/moin/WebFrameworks seems to be the place where folks are registering their respective web frameworks.
> >
> > I'd like to move some of the frameworks which are currently in the various categories which haven't been active in a few years. In particular, I'd like to move any framework which hasn't had a release since the beginning of 2008 (arbitrary) into the "Discontinued / Inactive" framework category. I'd be willing to do the work to make sure I wasn't moving one that actually *did* have releases past that but just hadn't updated the page.
> >
> > Any dissent?
> >
> > - C

Why not call them "apparently stable" versus "under active development"? Is the cgi module "discontinued"?

I'm a little sensitive on this topic because people tell me that Gadfly is "inactive" or "discontinued" but it still does what it does as documented very well. Frequent releases may actually be a sign of bugginess and bad design.

If you suspect a project is really dead, maybe you could try to contact the authors and ask about what they think.

-- Aaron Watters

===
BTW, I think "Release early, release often" is nonsense because it means you are probably releasing something buggy and unstable which will just alienate your users, who will never come back to see the better version.
Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec
I second the move, recorded here:

http://listtree.appspot.com/wsgi2/ICvaujouPxb2gfEhDS_aiw

-- Aaron Watters

--- On Thu, 11/26/09, James Y Knight wrote:
> From: James Y Knight
> Subject: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec
> To: "Web SIG"
> Date: Thursday, November 26, 2009, 8:42 PM
>
> I move to bless mod_wsgi's definition of WSGI 1.1 [1] as the official definition of WSGI 1.1, which describes how to implement WSGI adapters for both Python 2.x and 3.x. It may not be perfect, but it's been implemented twice, and seems to have no fatal flaws (it doesn't do any lossy transforms, so any issues are irritations at worst). The basis for this definition is also described in the "WSGI 1.0 Amendments" [2] page.
>
> The definitions as they stand are clear enough to understand and implement, but not currently in spec-worthy language. (e.g. it says "should" and "may" in a colloquial fashion, but actually means MUST in some places and SHOULD in others, as defined by RFC 2119)
>
> Thus, I'd like to suggest that Graham (if he's willing?) should reformat the "Definition"/"Amendments" as an actual diff against the current PEP 333. Then, I will recommend adopting that document as an actual standard WSGI 1.1, to replace PEP 333.
>
> This discussion has gone on long enough, and it doesn't really matter as much to have the perfect API, as it does to have a standard.
>
> James
>
> [1] http://code.google.com/p/modwsgi/wiki/SupportForPython3X
> [2] http://www.wsgi.org/wsgi/Amendments_1.0
Re: [Web-SIG] Future of WSGI
--- On Wed, 11/25/09, Chris Dent wrote:
> From: Chris Dent
> I can (barely) relate to some of the complaints that start_response is a pain in the ass, but environ, to me, is not broken.

I agree. It maps nicely onto the underlying protocol, and WSGI is supposed to be low level, right?

The biggest problem with start_response is that after you evaluate

    iterable = application(env, start_response)

sometimes start_response has been called and sometimes it hasn't, and this can break middlewares when they haven't been tested both ways (repoze.who, for example, seems to assume it has been called).

By the way, I created a little interface for archiving notes about wsgi2 here:

http://listtree.appspot.com/wsgi2

To add to it you need to fill in a captcha and use the password "wsgi". I thought I announced this to web-sig yesterday, but apparently I messed up my reply-to.

If you like, please add something there. I'd be delighted if you did. I think it might be an interface that is easier to "scan" than a million emails.

-- Aaron Watters
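The "sometimes it has been called and sometimes it hasn't" behaviour is easy to reproduce. In this small sketch (all names are made up for illustration) one application calls start_response before returning a list, while the other is a generator, so nothing in its body runs until the iterable is first consumed.

```python
def eager_app(environ, start_response):
    """Calls start_response immediately, then returns a list."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


def lazy_app(environ, start_response):
    """A generator: start_response runs only when iteration begins."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"hello"


def probe(app):
    """Report whether start_response was called before and after iteration."""
    calls = []

    def start_response(status, headers, exc_info=None):
        calls.append(status)
        return lambda data: None

    iterable = app({}, start_response)
    called_before_iteration = bool(calls)
    body = b"".join(iterable)
    called_after_iteration = bool(calls)
    return called_before_iteration, called_after_iteration, body
```

Middleware that inspects the status or headers right after calling the application will work with the first style and fail with the second, so it is safer to defer that inspection until the first chunk has been pulled from the iterable.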
Re: [Web-SIG] API to add a tree viewer / navigator to a web document ?
WHIFF now has treeviews. I will document the usage in a tutorial soonish. In the meantime here are the test/demos:

reloading frame variant: http://aaron.oirt.rutgers.edu/myapp/root/misc/frameTest
non-reloading ajax variant: http://aaron.oirt.rutgers.edu/myapp/root/misc/ajaxTest

The implementation supports large externally stored trees. Source for the implementations and demos is available from the WHIFF mercurial archive:

http://aaron.oirt.rutgers.edu/cgi-bin/whiffRepo.cgi

Hope you like!
-- Aaron Watters

===
"keep off the grass"
Peter Ustinov's requested gravestone epitaph.

--- On Fri, 10/2/09, denis wrote:
> From: denis
> Subject: [Web-SIG] API to add a tree viewer / navigator to a web document ?
> To: web-sig@python.org
> Date: Friday, October 2, 2009, 7:29 AM
>
> Folks,
> for one of you experts, this must be trivial / must exist already within some big Python-web package:
>
> say I'm looking at a long web doc.html which has no tree view on the left, but I can hack a local tree view file with level, name, href like
>
> + 1 US href=        ("+" button expands, "-" folds)
>     2 Alabama href=
>       3 ...
>     2 Alaska href=
>     ...
> + 1 Canada href=
>     ...
>
> Is there a small API that can generate a tree viewer / navigator from this, either side by side in the same browser window with the remote web pages, or in a separate window ?
> (The tree view lines can of course be reformatted to xml or whatever the API wants.)
>
> There are really 2 APIs here:
> a) class TreeView
> b) display the tree view and the remote web page in split windows.
>
> (Is there a general introduction to Python-webbing for someone who knows Python but almost no CSS nor web services ?)
>
> Thanks, cheers
> -- denis
Re: [Web-SIG] Web Framework
--- On Sun, 5/31/09, Omar Munk wrote:
> From: Omar Munk
> Subject: [Web-SIG] Web Framework
> To: web-sig@python.org
> Date: Sunday, May 31, 2009, 12:30 PM
>
> Hello
> A good documentation.
> Not to overkill like Django
> Easy and simple
> Just something like PHP but without the dirty style.
> I like Karrigell but it looks like it's dead do you know a clone of it?

Hi Omar. Please have a look at WHIFF. It has a lot of PHP-like features, but it's better :). Please let me know what you think of it. I'd be especially interested if you find the documentation hard to understand -- please let me know where you got confused or whatever. Thanks.

http://aaron.oirt.rutgers.edu/myapp/docs/W.intro

By the way, I agree that PHP is ugly and most Python frameworks are too complicated. I also think the only reason PHP is so popular is because there was never an appropriate Python based alternative for the kinds of things people like to do with PHP. This is the vacuum I'm trying to fill with WHIFF.

-- Aaron Watters

===
less is more
Re: [Web-SIG] Getting back to WSGI grass roots.
--- On Wed, 9/23/09, René Dudfield wrote:
> Application portability is the main wsgi use case. I think that requires a number of things that wsgi doesn't provide - wsgi knows nothing of data stores for example. Application portability is the main thing we should be interested in, and strive for it. Not just on web servers, but on web frameworks too.

Perhaps there should be a notion of managed application resources built as another layer on top of WSGI so you can easily switch between storing things in MySQL or the file system or Google App Engine data store or whatever. Perhaps something along these lines?

http://aaron.oirt.rutgers.edu/myapp/docs/W1000_1000.resources

> There's no way I can take any python web application, copy the files onto any python web server and have it work. php can do this, but we still can not do this with python.

I can :). This doesn't require changes to WSGI, however, just appropriate additional layers on top of WSGI which you can call WSGI++ or give another name -- I don't know which is better -- ask a marketing person.

-- Aaron Watters
http://aaron.oirt.rutgers.edu/myapp/docs/W1100_1400.calc

===
Little birds are playing
Bagpipes on the shore
Where the tourists snore
"Thanks!" they say, "'Tis thrilling!"
"Take, oh take this shilling!"
"Let us have no more!"
-- Lewis Carroll
Re: [Web-SIG] Getting back to WSGI grass roots.
--- On Wed, 9/23/09, Graham Dumpleton wrote:

> So, rather than throw away completely the idea of bytes everywhere,
> and rewrite the WSGI specification, we could instead say that the
> existing conceptual idea of WSGI 1.0 is still valid, and just build on
> top of it a translation interface to present that as unicode.

Seconded. There should be a lower level that talks bytes and a higher level that talks unicode or whatever. There should also be a way for even higher levels to reach down to the lower level to see the bytes before they got misdecoded by the unicode layer because this will likely be needed in some cases. Is there anything wrong with just adding decoded interpretations to the WSGI environment as separate entries?

Also, everything should be as orthogonal as possible. One problem I have with most Web tools and frameworks is they tend to take over and do everything at once when I really only want a little bit of help. WSGI 1 is nice because it just abstracts HTTP and stops there. It was a beautiful piece of work. Kudos.

-- Aaron Watters
http://aaron.oirt.rutgers.edu/myapp/docs/W1100_1600.openFlashCharts

==
All problems in computer science can be solved by another level of indirection -- David Wheeler
[of course the Java folk have proven over and over again that you can have too many layers...]
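A rough sketch of the "decoded interpretations as separate entries" idea, offered as an illustration rather than anything from the WSGI spec. It assumes a Python 3 server following the later PEP 3333 convention, where environ values are native strings holding one byte per character (latin-1). The function name and the `x-example.decoded.` key prefix are hypothetical.

```python
def add_decoded_entries(app, keys=("PATH_INFO", "QUERY_STRING"), charset="utf-8"):
    """Wrap a WSGI app so decoded copies of selected keys sit beside the originals."""
    def middleware(environ, start_response):
        for key in keys:
            raw = environ.get(key)
            if raw is None:
                continue
            # Recover the bytes the server received (PEP 3333 latin-1 convention).
            raw_bytes = raw.encode("latin-1")
            try:
                decoded = raw_bytes.decode(charset)
            except UnicodeDecodeError:
                # Do not guess: the app can still inspect the untouched original.
                decoded = None
            # Hypothetical key prefix; the original entry is never modified.
            environ["x-example.decoded." + key] = decoded
        return app(environ, start_response)
    return middleware
```

Because the original entries are left alone, a higher layer that distrusts the decoding can always reach down to `environ["PATH_INFO"]` and see exactly what arrived.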
Re: [Web-SIG] Developer authentication spec
I like the general idea of a dev-auth spec. Of course as you know developer mode is extremely dangerous. I would prefer that the spec made getting into developer mode more difficult so that people who don't know what they are doing have to jump through a couple hoops before they hang themselves. Just shoving

env['x-wsgiorg.developer_user'] = 'xxx'

should not be enough to enable dev mode, and as I read the spec this might be enough.

For example the environment value could be required to be an instance of wsgi.DeveloperUser(name) (or whatever) instead of a string so that some bit of Python at least is forced to import the right module and initialize an object before developer mode will work. This will probably be harder to do accidentally from a config file.

Alternatively, the spec could say something explicit about the 'client' and the 'server' having a shared secret -- maybe by requiring the client to send a timestamp hashed using a password or something. Any level of safeguard to make it harder for the ignorant innocent bystander sysadmin to shoot himself in the foot might be nice.

I'm particularly concerned because as it stands the spec might allow a WHIFF "remote procedure call" to get itself into developer mode automatically, with no password required, unless I add something explicit to prevent it.

Just my 2p.

-- Aaron Watters

===
http://aaron.oirt.rutgers.edu/myapp/docs/W1200_1400.stdMiddleware
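To make the two safeguards above concrete, here is a minimal sketch combining them. None of this is in the proposed spec: `DeveloperUser`, `sign_timestamp`, `developer_from_environ`, and the five-minute window are assumptions for illustration, and HMAC-SHA256 is just one reasonable reading of "a timestamp hashed using a password".

```python
import hashlib
import hmac
import time


class DeveloperUser(object):
    """Hypothetical marker type: a bare string in the environ does not qualify."""

    def __init__(self, name):
        self.name = name


def sign_timestamp(secret, timestamp):
    """Hex HMAC-SHA256 of the timestamp under the shared secret (secret is bytes)."""
    message = str(timestamp).encode("ascii")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def developer_from_environ(environ, secret, token, timestamp, max_age=300, now=None):
    """Return the DeveloperUser only if both safeguards pass, else None."""
    user = environ.get("x-wsgiorg.developer_user")
    # Safeguard 1: some Python code had to construct the object on purpose.
    if not isinstance(user, DeveloperUser):
        return None
    # Safeguard 2: the caller proves knowledge of the secret, recently.
    now = time.time() if now is None else now
    if abs(now - timestamp) > max_age:
        return None
    if not hmac.compare_digest(sign_timestamp(secret, timestamp), token):
        return None
    return user
```

With this shape, a config file that sets the key to a plain string gets nothing, and a remote call without the secret gets nothing either.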
Re: [Web-SIG] Announcing bobo
--- On Sun, 6/21/09, Graham Dumpleton wrote:

> In other words, bobo doesn't need to support that sort of mechanism
> directly as Apache/mod_wsgi makes it a reasonably trivial thing to do.

Nice! Ideally the same mechanism should work for other WSGI implementations, and the same mechanism should also work for "dropping" (fragments of) applications into another bobo application.

For example you can "drop in" middleware support for amCharts flash charts into any whiff application by adding an import statement to an __init__.py file, as described here

http://aaron.oirt.rutgers.edu/myapp/amcharts/doc

-- Aaron Watters

===
% if i had a ( for every $ bush spent, how many ('s would i have?
too many ('s.
Re: [Web-SIG] Announcing bobo
[whoops: graham, I meant to 'reply all', sorry for repeat]

--- On Sat, 6/20/09, Graham Dumpleton wrote:

> Personally I believe 'bobo' would be a good alternative for the people
> who currently get drawn to the simplicity of 'publisher'

It's not clear to me whether bobo has the "just drop a file or directory in the parent directory and it works" property like PHP, modpy/publisher, vanilla CGI, and WHIFF do. Does it, Jim?

I implemented WHIFF primarily because I wanted this property (and some others). WHIFF generalizes the "droppable file or directory" concept to allow an "application" to implement a fragment of a page or a middleware easily as well. Please see

http://aaron.oirt.rutgers.edu/myapp/docs/W1100_1050.Grading
http://aaron.oirt.rutgers.edu/myapp/docs/W1100_1200.wwiki

for examples of how this works in WHIFF.

-- Aaron Watters

===
% man woman
Segmentation fault.
Re: [Web-SIG] Announcing bobo
--- On Wed, 6/17/09, Sergey Schetinin wrote:

> When considering webapps and what urls they should handle it seems
> like the same should apply -- webapps define contained blocks of
> functionality and the task of placing them somewhere in URL-space
> belongs to the "caller" which in this case would be a configuration or
> serving script or, most often, a parent application.

So you seem to be suggesting that a web component should not be aware of its URL in the same sense that an object is not aware of its variable name in the scope of the application that is using the object. Is that right?

In particular you should be able to assign a component to any URL in the same sense that you can give an object any name. You should also be able to build relocatable URL trees which can be "mounted" anywhere in the "calling" application suite. Do I catch your meaning correctly?

It's not clear to me whether Bobo allows or disallows this (I think whiff, and standard cgi, for two examples, support this sort of url handling).

-- Aaron Watters
http://aaron.oirt.rutgers.edu/myapp/docs/W1500.whyIsWhiffCool

===
Emacs Makes A Computer Slow
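As an illustration of "the caller decides where a component lives", here is a small WSGI dispatcher sketch. It is not how Bobo or WHIFF do it; `mount` is a hypothetical helper. It relies only on the standard WSGI convention that `SCRIPT_NAME` holds the part of the URL already consumed and `PATH_INFO` the remainder, so the mounted component never needs to know its own prefix.

```python
def mount(routes, default=None):
    """Build a WSGI app that dispatches on URL prefix; the longest prefix wins.

    routes maps a prefix such as "/wiki" to a WSGI application.
    """
    ordered = sorted(routes.items(), key=lambda item: len(item[0]), reverse=True)

    def not_found(environ, start_response):
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]

    def dispatcher(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in ordered:
            # Match whole path segments only, so "/wiki" does not catch "/wikipedia".
            if path == prefix or path.startswith(prefix + "/"):
                child = dict(environ)
                # Shift the consumed prefix from PATH_INFO onto SCRIPT_NAME.
                child["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                child["PATH_INFO"] = path[len(prefix):]
                return app(child, start_response)
        return (default or not_found)(environ, start_response)

    return dispatcher
```

The same component object can be mounted at several prefixes, and a whole `mount(...)` tree can itself be mounted inside another one, which is the "relocatable URL tree" notion in the message above.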
Re: [Web-SIG] Announcing bobo
Re Jim's summary of URL routing

> ...I hope this helps ...

It helped me. Interesting summary. Thanks. I'm all for making things as simple and explicit as possible (at least as an optional or default behavior) and it looks like Bobo is much better than many other approaches in supporting simple and explicit URL routing. Thanks again, Jim.

-- Aaron Watters
http://aaron.oirt.rutgers.edu/myapp/docs/W1500.whyIsWhiffCool#Header6
http://aaron.oirt.rutgers.edu/myapp/amcharts/doc

===
Everything should be as simple as possible. But not simpler. -- Einstein
Re: [Web-SIG] Announcing bobo
> Nobody minds calling functions from other functions --
> that's basics of programming, but for some reason URL dispatching is
> seen as something different. Why?

I don't know, but every time I see a strange data structure with regular expressions in it that is supposed to define my web application URL structure, my skin crawls.

I think the simplicity of FILE_PATH==URL is one of the main reasons for the popularity of PHP -- other than that I can't think of any excuse for PHP. If python had a framework that had a simple and straightforward organization 5 to 10 years ago I don't think either PHP or Ruby/Rails would have ever evolved.

BTW, this was one of the primary reasons I created WHIFF -- I wanted a structure where I just dropped files into a directory and they were automatically treated as applications or middlewares with an URL derived from the file path with associated services which could be combined in a natural fashion... There wasn't anything like that available, so I had to create it.

-- Aaron Watters

===
http://aaron.oirt.rutgers.edu/myapp/amcharts/doc
http://aaron.oirt.rutgers.edu/myapp/docs/W1500.whyIsWhiffCool

less is more.
Re: [Web-SIG] Closing long-running WSGI requests (possible?)
I agree with Ionel. I personally wouldn't rely on "kill wsgi request". I'd run the update in a subprocess and kill the subprocess using a signal when the user requests (on unix, of course). I'd also check a log written by the subprocess to see whether it completed or not. If you "kill the wsgi request" you have the problem of not being quite sure whether the kill arrived in time, among other possible difficulties, some mentioned by Ionel.

-- Aaron Watters
http://aaron.oirt.rutgers.edu/myapp/docs/W0500.quickstart

(apologies to Christian, who got this twice, I forgot to "reply all")

--- On Mon, 4/13/09, Ionel Maries Cristian wrote:

> From: Ionel Maries Cristian
> Subject: Re: [Web-SIG] Closing long-running WSGI requests (possible?)
> To: "Christian Wyglendowski"
> Cc: "Chimezie Ogbuji", web-sig@python.org
> Date: Monday, April 13, 2009, 12:01 PM
>
> That implies one would have extremely reliable tcp connections, and clients
> graciously shutdown the connection and the server is notified of that.
>
> Most of the time that doesn't happen and the solution is to continuously send
> keepalive packets (some small string or whatever) - I'm assuming you run
> a batch a set of queries and you can interleave yielding some data while
> you run that batch.
>
> For example if your client disconnects and the servers tries to send some data
> it would fail - and trigger closing the app iterable.
>
> In contrast a server that just runs some backend processing without moving
> any data around doesn't have any way to know if the connection is still valid.
>
> Then again, even if the client properly shutdown the connection the server
> won't do anything about it if it doesn't try to do anything with the socket due
> to the synchronous nature (I'm assuming) of the whole server/app.
>
> -- ionel
>
> On Mon, Apr 13, 2009 at 17:53, Christian Wyglendowski wrote:
>
> > On Mon, Apr 13, 2009 at 10:40 AM, Chimezie Ogbuji wrote:
> > > Hello. I have a problem with a WSGI-based SPARQL server that I have been
> > > unable to resolve for some time. I was told this is the best place to ask
> > > :). I'm building a SPARQL [1] server that is deployed as WSGI/Paste
> > > server. SPARQL queries are handled by the server and evaluated against a
> > > MySQL database using mysql-python/MySQLdb to manage the connection.
> > >
> > > My goal is to be able to allow clients to close the connection in order to
> > > kill queries that have been dispatched (in order to 'abort' them).
> >
> > This should be doable from what I understand. From PEP 333:
> >
> > "If the iterable returned by the application has a close() method, the
> > server or gateway must call that method upon completion of the current
> > request, whether the request was completed normally, or terminated
> > early due to an error. (This is to support resource release by the
> > application. This protocol is intended to complement PEP 325's
> > generator support, and other common iterables with close() methods.)"
> > [1]
> >
> > So it sounds like you could add a close method on whatever iterable
> > that your application returns and have it do the required resource
> > release there.
> >
> > HTH,
> >
> > Christian
> > http://www.dowski.com
> >
> > [1] http://www.python.org/dev/peps/pep-0333/#specification-details
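The suggestions in this thread (run the work in a subprocess, yield data as a keepalive, release resources in close()) can be combined roughly as follows. This is a sketch under assumptions, not a tested recipe for any particular server: `SubprocessBody`, `application`, and the `JOB_SOURCE` stand-in job are hypothetical, and whether close() is ever called after a client disconnect depends on the server noticing a failed write.

```python
import subprocess
import sys

# Stand-in for a long-running job; a real one would run the query or update.
JOB_SOURCE = (
    "import time\n"
    "while True:\n"
    "    print('working', flush=True)\n"
    "    time.sleep(0.1)\n"
)


class SubprocessBody(object):
    """WSGI response iterable that streams a child's stdout and stops it on close()."""

    def __init__(self, argv):
        self.process = subprocess.Popen(argv, stdout=subprocess.PIPE)

    def __iter__(self):
        # Each yielded line doubles as a keepalive: if the client is gone,
        # the server's write fails and it can abandon the iterable.
        for line in self.process.stdout:
            yield line

    def close(self):
        # Per PEP 333 the server must call close() whether the request
        # finished normally or was terminated early.
        if self.process.poll() is None:
            self.process.terminate()  # SIGTERM on unix
        self.process.wait()
        self.process.stdout.close()


def application(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return SubprocessBody([sys.executable, "-c", JOB_SOURCE])
```

As noted in the reply above, the kill may still arrive too late, so a log or status record written by the child remains the reliable way to tell whether the job completed.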
[Web-SIG] Please look at WHIFF -- WSGI/HTTP INTEGRATED FILESYSTEM FRAMES
Hi folks,

I tried this announcement on some easy going lists yesterday and no one has taken me to the woodshed yet, so I thought I'd have a go at a tougher crowd.

I'm releasing a WSGI component suite called WHIFF and I'd just love it if you folks would have a look and comment/suggest/criticize/complain. If you'd like to try it out -- even better. Please go to

http://whiff.sourceforge.net

Or use one of the links in the announcement below. Thanks -- Aaron Watters

=== THIS .SIG IS INTENTIONALLY LEFT BLANK ===

WHIFF -- WSGI/HTTP INTEGRATED FILESYSTEM FRAMES

WHIFF is an infrastructure for easily building complex Python/WSGI Web applications by combining smaller and simpler WSGI components organized within file system trees.

To DOWNLOAD WHIFF go to the WHIFF project information page at http://sourceforge.net/projects/whiff and follow the download instructions.

To GET THE LATEST WHIFF clone the WHIFF Mercurial repository located at http://aaron.oirt.rutgers.edu/cgi-bin/whiffRepo.cgi.

To READ ABOUT WHIFF view the WHIFF documentation at http://aaron.oirt.rutgers.edu/myapp/docs/W.intro.

To PLAY WITH WHIFF try the demos listed in the demos page at http://aaron.oirt.rutgers.edu/myapp/docs/W1300.testAndDemo.

Why WHIFF?
==========

WHIFF (WSGI HTTP Integrated Filesystem Frames) is intended to make it easier to create, deploy, and maintain large and complex Python based WSGI Web applications. I created WHIFF to address complexity issues I encounter when creating and fixing sophisticated Web applications which include complex database interactions and dynamic features such as AJAX (Asynchronous JavaScript and XML).

The primary tools which reduce complexity are an infrastructure for managing web application name spaces, a configuration template language for wiring named components into an application, and an application programming interface for accessing named components from Python and JavaScript modules.

All supporting conventions and tools offered by WHIFF are optional. WHIFF is designed to work well with other modules conformant to the WSGI (Web Server Gateway Interface) standard. Developers and designers are free to use those WHIFF tools that work for them and ignore or replace the others.

WHIFF does not provide a "packaged cake mix" for baking a web application. Instead WHIFF is designed to provide a set of ingredients which can be easily combined to make web applications (with no need to refine your own sugar or mill your own wheat).

I hope you like it. -- Aaron Watters