With the list's help I have produced a workable
routine for the problem. For closure I post here the current
script. I had an embarrassingly difficult time crafting the
verb "year", btw, but enjoyed the challenge. Thank you for
your help.
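For readers who find the tacit "year" hard to follow, here is a more explicit sketch of the same year/month arithmetic. It is not taken from the script: the name ymtable is hypothetical, and it returns a two-column table rather than the separate year and month lists the script builds.

```j
NB. ymtable (hypothetical name): explicit version of the year/month arithmetic.
NB. y is the same triple the script's year and month take:
NB.   start year after 2000, start month (April = 4), number of months.
NB. Result: one row per month, as (year, zero-based month index).
ymtable =: 3 : 0
'yr mo n' =. y
k =. (<: mo) + i. n              NB. months counted from January of the start year
(2000 + yr + <. k % 12) ,. 12 | k
)

ymtable 3 4 15     NB. rows run from 2003 3 (April 2003) to 2004 5 (June 2004)
```

The second column can index the months list directly, in the same way as months{~month does in the script.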
******************script begins******************
NB. readjdata.ijs
NB. 7/11/8
load '~user/httpget.ijs'
require 'regex'
months =: <;._2 ]0 : 0
January
February
March
April
May
June
July
August
September
October
November
December
)
yrsmths =: 12 12&#:
NB.* year v
NB. monad triple: start year after 2000
NB. start month (eg. April = 4)
NB. number of months
NB. year 3 4 15 is start at 2003-April and contain 15 months
year =: 2000+{.+({:{.<:@(1&{)|.((12&[EMAIL PROTECTED], 12&#)&>:@{.)& [EMAIL PROTECTED]:)
NB. month v
NB. monad triple: same inputs as year triple
month =: <:@(1&{) 12&|@+ i.@{:
yearmonth =: ,each/@(;/@('-',.~":)@,.@year,:months{~month)
NB. yearmonth 3 4 15
urlhead =: <'http://www.jsoftware.com/pipermail/programming/'
urltail =: <'/thread.html'
readdata =: monad define
result =. i. 0
y =. ;"1 urlhead,.y,.urltail
for_x. y
do.
temp =. httpget x
temp =. 'Messages:[^0-9]*([0-9]+)' (,.@:{:@rxmatch ];.0 ]) temp
result =. result, ". temp
end.
result
)
Note 'demo'
months{~month 3 4 15
;/@('-',.~":)@,.@year 3 4 15
A=: httpget'http://www.jsoftware.com/pipermail/programming/2007-October/thread.html'
'Messages:[^0-9]*([0-9]+)' (,.@:{:@rxmatch ];.0 ]) A
$data =: readdata yearmonth 5 10 34
)
******************script ends******************
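The extraction step inside readdata can be tried on its own, without fetching anything. The sketch below applies the script's pattern to a literal string. The sample text is made up as a stand-in for a fetched page, and the names pat, sample, m and count are illustrative only.

```j
require 'regex'
pat    =: 'Messages:[^0-9]*([0-9]+)'
sample =: '<b>Messages:</b> 357'        NB. made-up stand-in for the fetched page text
m      =: pat rxmatch sample            NB. row 0: whole match; row 1: the ([0-9]+) group
count  =: ". (,. {: m) ];.0 sample      NB. cut out the digits and convert to a number
```

Each row of the rxmatch result is a (start, length) pair. Taking the last row with {: and turning it into a column with ,. gives the left argument that ;.0 needs to cut the digits out of the text. On this sample, count is the number 357.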
On Thu, 10 Jul 2008, Brian Schott wrote:
+ The link below is typical of ones I would like to
+ read because it produces a line for each month that gives
+ the number of messages for its month: "Messages: 357". So I
+ would like to have suggestions for how to read (at least the
+ first few lines) of all such files from the web as a text
+ file in such a way that I can extract the key line from each
+ into a single file for stats analysis. By all such files I
+ mean the ones that are like the following link, but for
+ which the yyyy-month phrase is all that differs.
+
+
+ http://www.jsoftware.com/pipermail/programming/2007-October/thread.html
+
+ TIA,
+
+