Re: A proposal for changing pig's memory management

2009-05-19 Thread Ted Dunning
If you have a small number of long-lived large objects and a large number of
small ephemeral objects then the java collector should be in pig-heaven (as
it were).  The long-lived objects will take no time to collect and the
ephemeral objects won't be around to collect by the time the full GC
happens.

On Tue, May 19, 2009 at 3:44 PM, Alan Gates ga...@yahoo-inc.com wrote:

 Perhaps switching to large buffers instead of having many individual
 objects will address this.  But I'm concerned that if we cannot explicitly
 force data out of memory onto disk then we'll be back in the same boat of
 trusting the Java memory manager.
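
For reference, explicitly forcing a large buffer out of memory could look roughly like the sketch below. It assumes data is already held in a `ByteBuffer`. The `SpillFile` class and its method names are hypothetical and are not part of Pig. Once the caller drops its reference to the in-memory buffer, reclaiming it is still up to the collector, but the data no longer depends on it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper (not a Pig class): explicit spill of a buffer to disk. */
final class SpillFile {
    /** Writes the remaining bytes of data to a new temp file and returns its path. */
    static Path spill(ByteBuffer data) throws IOException {
        Path file = Files.createTempFile("spill", ".bin");
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            // Work on a duplicate so the caller's position and limit are untouched.
            ByteBuffer src = data.duplicate();
            while (src.hasRemaining()) {
                ch.write(src);
            }
        }
        return file;
    }

    /** Reads a previously spilled file back into a heap buffer. */
    static ByteBuffer load(Path file) throws IOException {
        return ByteBuffer.wrap(Files.readAllBytes(file));
    }
}
```
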


-- 
Ted Dunning, CTO
DeepDyve


Re: A proposal for changing pig's memory management

2009-05-14 Thread Ted Dunning
That Telegraph dataflow paper is pretty long in the tooth.  Certainly
several of their claims have little force any more (lack of non-blocking
I/O, poor thread performance, no unmap, very expensive synchronization for
uncontested locks).  It is worth noting that they did all of their tests on
the 1.3 JVM and things have come an enormous way since then.

Certainly, it is worth having opaque containers based on byte arrays, but
isn't that pretty much what the NIO byte buffers are there to provide?
Wouldn't a virtual tuple type that was nothing more than a byte buffer, a
type, and an offset do almost all of what is proposed here?
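
A minimal sketch of that idea, assuming a field is fully described by a shared `ByteBuffer`, a type tag, and an offset. The `BufferField` class, the `FieldType` enum, and the length-prefixed string layout are all hypothetical choices for illustration. None of them come from Pig or from the PigMemory proposal.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical type tags for this sketch; not Pig's DataType constants. */
enum FieldType { INT, LONG, CHARARRAY }

/** A field that is only a view: shared buffer + type + offset, no copied data. */
final class BufferField {
    private final ByteBuffer buf;
    private final FieldType type;
    private final int offset;

    BufferField(ByteBuffer buf, FieldType type, int offset) {
        this.buf = buf;
        this.type = type;
        this.offset = offset;
    }

    FieldType type() { return type; }

    int asInt() { require(FieldType.INT); return buf.getInt(offset); }

    long asLong() { require(FieldType.LONG); return buf.getLong(offset); }

    /** Decodes a string stored as a 4-byte length followed by UTF-8 bytes. */
    String asString() {
        require(FieldType.CHARARRAY);
        int len = buf.getInt(offset);
        byte[] bytes = new byte[len];
        ByteBuffer view = buf.duplicate();   // shares content, own position
        view.position(offset + 4);
        view.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Writes a length-prefixed UTF-8 string at offset; returns the next free offset. */
    static int putString(ByteBuffer buf, int offset, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        buf.putInt(offset, bytes.length);
        ByteBuffer view = buf.duplicate();
        view.position(offset + 4);
        view.put(bytes);
        return offset + 4 + bytes.length;
    }

    private void require(FieldType expected) {
        if (type != expected) {
            throw new IllegalStateException("field is " + type + ", not " + expected);
        }
    }
}
```

With this layout the per-field object stays tiny and short-lived, while the data itself lives in a few large buffers.
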

On Thu, May 14, 2009 at 5:33 PM, Alan Gates ga...@yahoo-inc.com wrote:

 http://wiki.apache.org/pig/PigMemory

 Alan.



Re: Ajax library for Pig

2009-04-14 Thread Ted Dunning
Each pig program submission should involve a separate piglatin interpreter.
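
As a rough sketch of what that means on the server side: every submission gets a freshly constructed interpreter, so no state is shared between users. `ScriptInterpreter` here is a hypothetical stand-in that only records alias definitions, not Pig's actual API, and the pool size is an arbitrary assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for a Pig Latin interpreter; holds per-submission state. */
final class ScriptInterpreter {
    private final String user;
    private final Map<String, String> aliases = new HashMap<>();

    ScriptInterpreter(String user) { this.user = user; }

    /** Records each "alias = expression" statement; returns the number of aliases. */
    int run(String script) {
        for (String statement : script.split(";")) {
            String[] parts = statement.split("=", 2);
            if (parts.length == 2) {
                aliases.put(parts[0].trim(), parts[1].trim());
            }
        }
        return aliases.size();
    }
}

/** Runs each submission on a worker thread with its own interpreter instance. */
final class SubmissionService {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    Future<Integer> submit(String user, String script) {
        // A new interpreter per submission: nothing is shared across users.
        Callable<Integer> task = () -> new ScriptInterpreter(user).run(script);
        return pool.submit(task);
    }

    void shutdown() { pool.shutdown(); }
}
```
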

On Tue, Apr 14, 2009 at 2:32 PM, nitesh bhatia niteshbhatia...@gmail.com wrote:

 Hi
 I have one doubt at the moment: how can this system be designed so that
 multiple users can run the same pig installation?
 In the current scenario, each user executes their own copy of pig.jar from a
 shell and accesses hadoop.

 Under this system, however, multiple users will log in to some domain and
 have separate sessions. Suppose user1 submits a pig script or accesses
 pig, and then user2 also accesses the pig shell. How will this system work
 for multiple users? I am not sure what the best solution would be.

 --nitesh



 On Wed, Apr 15, 2009 at 2:07 AM, Alan Gates ga...@yahoo-inc.com wrote:

  Would you want to contribute this to the Pig project or release it
  separately?  Either way, keep us posted on your progress.  It sounds
  interesting.
 
  Alan.
 
 
  On Apr 9, 2009, at 9:28 PM, nitesh bhatia wrote:
 
   Hi
  Thanks for the reply.
  This will be the architecture:
 
  1. Pig would be installed on some dedicated server machine (say P) with
  hadoop support.
  2. In front of it will be a web server (say S)
   2.1 A web server will consist of a dedicated tomcat server (say St) for
  handling dwr servlets.
   2.2 PigScript.js - proposed javascript.
   2.3 If the user is using some server other than tomcat for the
  presentation layer (say http for php or IIS for asp.net), that server
  (say Su) will appear in front of St.

  - Connections between Su and St will be done through PigScript.js
  - Connections between St and P will be done through dwr
  - To get the results from the server, this system will use Reverse-ajax
  calls (i.e. async calls from server to browser, an inbuilt feature in
  DWR).
 
  DWR is under Apache Licence V2.
 
  --nitesh
 
  On Wed, Apr 8, 2009 at 9:11 PM, Alan Gates ga...@yahoo-inc.com wrote:
 
  Sorry if these are silly questions, but I'm not very familiar with some of
  these technologies.  So what you propose is that Pig would be installed on
  some dedicated server machine and a web server would be placed in front of
  it.  Then client libraries would be developed that made calls to the web
  server.  Would these client side libraries include presentation in the
  browser, both for users submitting queries and receiving results?  Also,
  pig currently does not have a server mode, thus any web server would have
  to spin off threads that ran a pig job.

  If the above is what you're proposing, I think it would be great.  Opening
  up pig to more users by making it browser accessible would be nice.
 
  Alan.
 
 
  On Apr 3, 2009, at 5:36 AM, nitesh bhatia wrote:
 
  Hi
 
  Since pig is getting a lot of usage in industries and universities;
  how about adding a front-end support for Pig? The plan is to write a
  jquery/dojo type of general JavaScript/AJAX library which can be used
  over any server technologies (php, jsp, asp, etc.) to call pig
  functions over web.
 
  Direct Web Remoting (DWR- http://directwebremoting.org ), an open
  source project at Java.net gives a functionality that allows
  JavaScript in a browser to interact with Java on a server. Can we
  write a JavaScript library exclusively for Pig using DWR? I am not
  sure about licensing issues.
 
  The major advantages I can point to are:
  - Use of Pig over HTTP rather than SSH.
  - User management becomes easy, as it can be handled by any CMS.
 
  --nitesh
 
  --
  Nitesh Bhatia
  Dhirubhai Ambani Institute of Information and Communication Technology
  Gandhinagar
  Gujarat
 
  Life is never perfect. It just depends where you draw the line.
 
  visit:
  http://www.awaaaz.com - connecting through music
  http://www.volstreet.com - lets volunteer for better tomorrow
  http://www.instibuzz.com - Voice opinions, Transact easily, Have fun
 
 
 
 
 




-- 
Ted Dunning, CTO
DeepDyve

111 West Evelyn Ave. Ste. 202
Sunnyvale, CA 94086
www.deepdyve.com
858-414-0013 (m)
408-773-0220 (fax)


Re: switching to different parser in Pig

2009-02-24 Thread Ted Dunning
Yes.

And one thing I should have mentioned was Chris W's thoughts along the lines
that it would be very nice to expose the logical plan to something like
Cascading so that a global restructuring could be done across more than just
Pig programs.  It works the other way as well, with it becoming possible for
Pig to execute programs expressed (conceivably) in Cascading form.

On Tue, Feb 24, 2009 at 1:27 AM, pi song pi.so...@gmail.com wrote:

 I think what Ted mentioned is more embedding Pig in
 other languages and use those languages to do loops.




-- 
Ted Dunning, CTO
DeepDyve

111 West Evelyn Ave. Ste. 202
Sunnyvale, CA 94086
www.deepdyve.com
408-773-0110 ext. 738
858-414-0013 (m)
408-773-0220 (fax)


Re: switching to different parser in Pig

2009-02-20 Thread Ted Dunning
Probably nearly the same effect as you suggest.  Are the concepts at the
logical plan layer similar to those expressed in pig latin?  Or has a
significant transformation occurred by then?

On Fri, Feb 20, 2009 at 1:59 AM, pi song pi.so...@gmail.com wrote:

 Sounds good but how about exposing the logical plan layer instead? Wouldn't
 that yield the same effect?  From python for example you still can construct
 a logical plan and give it to Pig to execute.




-- 
Ted Dunning, CTO
DeepDyve