With the list's help I have produced a workable
routine for the problem. For closure I post here the current
script. I had an embarrassingly difficult time crafting the
verb "year", btw, but enjoyed the challenge. Thank you for
your help.

******************script begins******************
NB. readjdata.ijs
NB. 7/11/8

load '~user/httpget.ijs'
require 'regex'

months =: <;._2 ]0 : 0
January
February
March
April
May
June
July
August
September
October
November
December
)

yrsmths =: 12 12&#:
NB.* year v
NB.  monad triple: start year after 2000
NB.                start month (eg. April = 4)
NB.                number of months
NB.  year 3 4 15 starts at 2003-April and spans 15 months
year =: 2000+{.+({:{.<:@(1&{)|.((12&[EMAIL PROTECTED], 12&#)&>:@{.)& [EMAIL PROTECTED]:)

NB. month v
NB. monad triple: same inputs as year triple
month =: <:@(1&{) 12&|@+ [EMAIL PROTECTED]:

yearmonth =: ,each/@(;/@('-',.~":)@,[EMAIL PROTECTED],:months{~month)
NB.  yearmonth 3 4 15

urlhead =: <'http://www.jsoftware.com/pipermail/programming/'
urltail =: <'/thread.html'

readdata =: monad define
  result =. i. 0
  y =. ;"1 urlhead,.y,.urltail
  for_x. y
  do.
    temp =. httpget x
    temp =. 'Messages:[^0-9]*([0-9]+)' (,.@:{:@rxmatch ];.0 ]) temp
    result =. result, ". temp
  end.
  result
)

Note 'demo'
months{~month 3 4 15
;/@('-',.~":)@,[EMAIL PROTECTED] 3 4 15

A=: httpget 'http://www.jsoftware.com/pipermail/programming/2007-October/thread.html'
'Messages:[^0-9]*([0-9]+)' (,.@:{:@rxmatch ];.0 ]) A

$data =: readdata yearmonth 5 10 34
)
******************script   ends******************
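
For anyone who finds the tacit "year" and "month" verbs hard to follow, here is a minimal explicit sketch of the same idea: turn the triple (years after 2000, start month, number of months) into boxed labels such as 2003-April. The names ymtable and ymlabels are introduced only for this illustration and are not part of the script above; months is redefined so the block stands alone. It is a sketch of one possible approach, not a transcription of the posted verbs.

```j
NB. Hypothetical explicit alternative; ymtable and ymlabels are
NB. illustrative names, not part of readjdata.ijs.
months =: ;: 'January February March April May June July August September October November December'

NB. ymtable y: y is (years after 2000),(start month 1..12),(number of months)
NB. Result is a 2-column table: calendar year, zero-based month index.
ymtable =: 3 : 0
  'yr mo n' =. y
  k =. (<: mo) + i. n          NB. months counted from January of the start year
  (2000 + yr + <. k % 12) ,. 12 | k
)

NB. ymlabels y: boxed 'yyyy-Month' strings for the same triple
ymlabels =: 3 : 0
  t =. ymtable y
  (":&.> {."1 t) ,&.> '-' ,&.> months {~ {:"1 t
)
```

For example, ymlabels 3 11 3 should give the three boxes 2003-November, 2003-December and 2004-January, and the result can be passed to readdata in place of the output of yearmonth.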


On Thu, 10 Jul 2008, Brian Schott wrote:

+       The link below is typical of ones I would like to
+ read because it produces a line for each month that gives
+ the number of messages for its month: "Messages: 357".  So I
+ would like to have suggestions for how to read (at least the
+ first few lines) of all such files from the web as a text
+ file in such a way that I can extract the key line from each
+ into a single file for stats analysis. By all such files I
+ mean the ones that are like the following link, but for
+ which the yyyy-month phrase is all that differs.
+
+
+ http://www.jsoftware.com/pipermail/programming/2007-October/thread.html
+
+ TIA,
+
+
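
The extraction step asked about here can be tried without any web access. The sketch below pulls the number that follows "Messages:" out of a piece of text using the same pattern as the script. The name msgcount and the sample string are assumptions made for illustration; the real page markup may differ. It relies on rxmatch returning one row per subexpression (start, length), with _1 in the start column when nothing matches, which is my understanding of the regex script.

```j
require 'regex'

NB. msgcount is a hypothetical helper, not part of readjdata.ijs.
NB. y is page text; result is the message count, or _1 if not found.
msgcount =: 3 : 0
  m =. 'Messages:[^0-9]*([0-9]+)' rxmatch y
  if. _1 = {. {: m do. _1 return. end.   NB. assumed no-match signal
  's l' =. {: m                 NB. start and length of the digit group
  ". l {. s }. y                NB. cut out the digits and convert
)

NB. Invented sample text standing in for a downloaded page:
msgcount '<b>Messages:</b> 357<p>'
```

With a working httpget, msgcount httpget url would play the same role as the rxmatch phrase inside readdata.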
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
