Is something like this what you are looking for?

julia> matchall(r"(\d+?.+?\.ts)", s)
3-element Array{SubString{UTF8String},1}:
 "00001.ts"
 "00002.ts"
 "00xyz.ts"
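
If you only want each name once and in order, a possible variation is to iterate with eachmatch over the stricter pattern from your message and keep the capture group. This is only a sketch: the sample string below is hypothetical, and the real XHTML from the device may look different. It also avoids matchall, which was removed in Julia 1.0.

```julia
# Hypothetical sample of the directory listing; the device's real markup may differ.
s = """
<a href="00001.ts">00001.ts</a>
<a href="00002.ts">00002.ts</a>
<a href="00003.ts">00003.ts</a>
"""

# eachmatch yields the non-overlapping matches in order of appearance;
# captures[1] is the text matched by the parenthesized group.
names = [m.captures[1] for m in eachmatch(r">(\d+\.ts)<", s)]

# The last segment name, or nothing if the pattern never matched.
last_name = isempty(names) ? nothing : names[end]

println(names)
println(last_name)
```

Because the pattern requires the surrounding > and <, the copies of each name inside the href attributes are skipped, so no manual offset bookkeeping is needed.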



On Thursday, December 24, 2015, 12:30:52 (UTC-6), Douglas Bates 
wrote:
>
> Short version:
>
> I have a string that contains several instances of names of the form 
> 00001.ts, 00002.ts, ..., 00xyz.ts and I want to find the last match.
> That is, I want to find "00xyz.ts" or, alternatively, find all such names 
> in sequence.
>
> Longer version:
>
> These are file names of a series of transport stream files containing 
> audio, video, closed captions, etc.  The device generating them, a Tablo 
> over-the-air video recorder, http://tabotv.com, breaks a recording into 
> many small segments with these names.  A program can query the device at a 
> particular URL and get a listing of the directory containing these 
> segments, returned as XHTML.  I want all the file names in sequence so that 
> I can download each of these files in sequence and create a single file by 
> appending them.
>
> The names each occur multiple times in the string but each one only once 
> in the form that will match r">(\d+\.ts)<". 
>
> I can think of two ways of getting these names.  One is to parse the 
> string as XHTML and walk through the object to find these names.  The other 
> is to match the regular expression in the string, extract the "captures" 
> field, match again starting at the current offset + 1, and continue until 
> there are no further matches.
>
> The XML approach is more elegant but not especially easy.  The regular 
> expression matching is reasonably straightforward to implement but more 
> fragile.
>
> Am I missing an elegant, robust approach here?
>
