Re: [PD] http, html and textfiles

2008-05-02 Thread mark edward grimm
hello,

If we WERE going to use just pd for longer text type
processing, what optimization methods would be
recommended?

Is there a particular font that PD handles better than
others?
Would it be wise to strip the text of backslashes,
spaces and commas, as I'm assuming from your post?
Can PD grab a random 'line' from a text file so as not
to have to load the whole thing?

Thanks
mark
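
On the "random line" question: a minimal sketch, in Python rather than Pd, of picking one random line from a text file without holding the whole file in memory. It uses reservoir sampling, so the file is still read once from start to end, but only one line is kept at a time. The function name random_line and the file name in the usage comment are made up for illustration.

```python
import random


def random_line(path, rng=random):
    """Return one uniformly random line from a text file, or None if it is empty.

    Reservoir sampling: the i-th line (counting from 1) replaces the current
    pick with probability 1/i, so every line ends up equally likely and only
    one line is held in memory at any time.
    """
    chosen = None
    with open(path, encoding="utf-8", errors="replace") as handle:
        for count, line in enumerate(handle, start=1):
            # randrange(count) is 0 with probability 1/count
            if rng.randrange(count) == 0:
                chosen = line.rstrip("\r\n")
    return chosen


# Hypothetical usage:
#   print(random_line("poem.txt"))
```

The result could then be handed to Pd as a single short message instead of loading the full text into a patch.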


mark edward grimm | m.f.a | ed.m
syracuse u. | vpa foundations | timearts
adjunct | new media consultant
megrimm.net | socialmediagroup.org  .com   
[EMAIL PROTECTED] | 315.378.2136


  



___
PD-list@iem.at mailing list
UNSUBSCRIBE and account-management - 
http://lists.puredata.info/listinfo/pd-list


Re: [PD] http, html and textfiles

2008-05-02 Thread Martin Peach
If you have Pd-extended you can use mrpeach/tcpclient to retrieve the 
page and the mrpeach/str object to snip the data on any character or 
position. You have to enter things like spaces and backslashes as their 
decimal equivalents, though. An important pair is 13 10 for CR LF. It works 
better if you are accessing simple web pages, especially if you are printing 
out the characters as they arrive. Neither HTTP nor HTML knows about fonts; 
it's all pure ASCII text.

Martin
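
To make the "decimal equivalent" idea concrete, here is a small Python sketch that converts text to a list of decimal byte values and back, builds a minimal HTTP request in that form, and splits received bytes on the CR LF pair (13 followed by 10). It does not use mrpeach/tcpclient itself; the helper names and the host example.com are placeholders chosen for illustration.

```python
CR, LF = 13, 10  # carriage return and line feed as decimal byte values


def to_decimal_bytes(text):
    """Encode ASCII text as a list of decimal byte values."""
    return list(text.encode("ascii"))


def from_decimal_bytes(values):
    """Decode a list of decimal byte values back into ASCII text."""
    return bytes(values).decode("ascii")


def http_get_request(host, path="/"):
    """Build a minimal HTTP/1.0 GET request as decimal byte values."""
    return to_decimal_bytes(f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n")


def split_on_crlf(values):
    """Split a list of byte values into lines at each CR LF pair."""
    lines, current = [], []
    i = 0
    while i < len(values):
        if values[i] == CR and i + 1 < len(values) and values[i + 1] == LF:
            lines.append(current)
            current = []
            i += 2  # skip both bytes of the pair
        else:
            current.append(values[i])
            i += 1
    lines.append(current)
    return lines


print(to_decimal_bytes(" "))    # [32]
print(to_decimal_bytes("\\"))   # [92]
print(to_decimal_bytes("\r\n")) # [13, 10]
print(" ".join(str(v) for v in http_get_request("example.com")))
```

The last line prints the request as a space-separated run of numbers, which is the kind of list one would type into a message box when characters have to be entered by value.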



Re: [PD] http, html and textfiles

2008-05-02 Thread mark edward grimm
Hey, thanks for the tip, I will try that out...

 Neither http nor html know about fonts, 
 it's all pure ascii text.

Oh yes, I know this. I meant an optimized font for when
the text is shown in the Pd/GEM window... I guess it wouldn't
really matter though, as long as the loaded font wasn't
too complicated.

Thanks!
mark


[PD] http, html and textfiles

2008-04-28 Thread wolfgang schwarzenbrunner
hello,

i am working on a little project in which websites are going to be 
parsed. i thought this might be a nice use for the regex 
object from zexy... the only problem i am facing right now is that i 
have no idea how to get an html file onto my hard disk using pd 
(something like an http browsing object)...

any suggestions?

best
wolfgang



Re: [PD] http, html and textfiles

2008-04-28 Thread Max Neupert
i'm having a déjà vu. check the thread in the archive:

Subject: Retrieving Text form a URL | Webpage



Re: [PD] http, html and textfiles

2008-04-28 Thread Frank Barknecht
Hello,
wolfgang schwarzenbrunner wrote:

 i am working on a little project in which websites are going to be 
 parsed. well. i thought this might be a nice thing using the regex 
 object from zexy... the only problem i am facing right now is that i 
 have no idea how i could get a html file on my harddisk using pd 
 (something like a http browsing object)...
 
 any suggestions?

Yep: Don't use Pd for text processing.

Pd is good at many things, but it's not good at parsing and modifying
larger amounts of text. AFAIK there is still no garbage collection for
unused symbols (Pd's strings), and it's overly complicated to deal with
certain characters (backslashes, spaces, commas, ...) when they should
not be interpreted by Pd.

What I would recommend is to do your text processing in a different
language. Many (scripting) languages that are great with text can be
used inside of Pd: Lua, Python, Java, Scheme, etc. Most of these also
include, or can easily be extended with, nice web browsing tools (cURL,
sockets, system(wget), ...). In the end you can do both the browsing and
all the processing in one place and then only need to hand the results over
to Pd in a format that Pd can handle more elegantly than it can handle
large amounts of text.

Of course it depends a bit on how complex your project is, so you may
get away with pure Pd as well, but IMO it's a better use of Pd to
hand the text processing off to a language better suited to it.

Ciao
-- 
Frank Barknecht
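
As a rough illustration of the workflow described above, here is a minimal Python sketch that fetches a page, reduces it to plain words, and formats them as a single Pd message. It relies on several assumptions that are not from the thread: the patch contains a [netreceive 3000] object listening for TCP on port 3000, the message selector is called "words", and all function names are invented for this example. Characters that Pd would interpret (backslashes, commas, semicolons and so on) are simply dropped by keeping only letters and digits.

```python
import re
import socket
import urllib.request
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text of a page, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def fetch(url):
    """Download a page and return its body as text."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_words(html):
    """Return the words of a page, keeping only ASCII letters and digits."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return re.findall(r"[A-Za-z0-9]+", " ".join(parser.chunks))


def to_pd_message(selector, words):
    """Format one message: atoms separated by spaces, ended by a semicolon."""
    return " ".join([selector, *words]) + ";\n"


def send_to_pd(message, host="localhost", port=3000):
    """Send a message to a [netreceive] object (assumed to listen on `port`)."""
    with socket.create_connection((host, port), timeout=5) as connection:
        connection.sendall(message.encode("ascii"))


# Hypothetical usage, with Pd running and [netreceive 3000] in the patch:
#   page = fetch("http://example.com/")
#   send_to_pd(to_pd_message("words", extract_words(page)))
```

Inside Pd the message could then be picked apart with [route words], so the patch only ever sees a clean list of atoms rather than raw HTML.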
