Re: URL lengths

2017-11-30 Thread Steve Fryatt
On 30 Nov, Tim Hill wrote in message
<56a3223323...@timil.com>:

> I just can't believe that a URL can contain 33% more letters than the King
> James Bible's 3,116,480.
> 
> "four gigabytes per URL"
> 
> LOL

In RISC OS terms, it's another way of saying "limited by available
memory"...

-- 
Steve Fryatt - Leeds, England

http://www.stevefryatt.org.uk/



Re: URL lengths

2017-11-30 Thread Steve Fryatt
On 30 Nov, Harriet Bazley wrote in message:

> On 28 Nov 2017 as I do recall,
>   Daniel Silverstone  wrote:
>
> > I don't believe we limit URL length per se, though they get interned and
> > as such four gigabytes per URL is probably the absolute limit.  In
> > addition, POST data is nominally unlimited though I believe we have a
> > similar four gigabyte limit.
> 
> I had an error today from Netsurf, reporting that a URL was too long to
> display (although it seemed to work).

Due to the problems that the Wimp has with changing the allocation of
writable icon buffers, there's an arbitrary limit (255 characters,
perhaps?) in the RISC OS front-end's URL bar. If the core tries to display
a longer URL, the bar is just cleared and you get a warning -- but it
doesn't otherwise affect the browser's operation.

This obviously means that there's a limit to the size of URL that you can
type in, too. But that's it. You can follow any length of link in NetSurf,
and launch any length of URL via the launch protocols.
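
A minimal sketch (untested) of one launch route from BASIC: write the URL
to an ANT URL file and Filer_Run it. The filename is arbitrary, and it
assumes the ANT URL filetype is &B28 with a handler registered for it:

DEF PROClaunch_url(buf%)
REM buf% points to a NUL-terminated URL of any length.
LOCAL f%, i%
f% = OPENOUT "<Wimp$ScrapDir>.URLFile"
i% = 0
WHILE buf%?i% <> 0
  BPUT#f%, buf%?i%
  i% += 1
ENDWHILE
CLOSE #f%
SYS "OS_File",18,"<Wimp$ScrapDir>.URLFile",&B28 : REM set the filetype
OSCLI "Filer_Run <Wimp$ScrapDir>.URLFile"
ENDPROC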

All subject to that 4GB limit, of course.

-- 
Steve Fryatt - Leeds, England

http://www.stevefryatt.org.uk/



Re: URL lengths

2017-11-30 Thread Richard Torrens (lists)
In article <56a1aa7e00joh...@ukgateway.net>,
   John Williams  wrote:
> What is the maximum URL length (including POST data) that NetSurf can
> handle?

What may be relevant is the following entry from my web server error
logs:

2017-10-02 02:57:45: (response.c.553) file not found ... or so:  File name
too long
/YesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHXYesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHXYesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHXYesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHXYesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHXYesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHXYesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHXYesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHXYesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScann ->

At which point the Web server (lighttpd) logs it as an error.

-- 
Richard Torrens.
http://www.Torrens.org for genealogy, natural history, wild food, walks, cats
and more!



Re: URL lengths

2017-11-30 Thread Rob Kendrick
On Thu, Nov 30, 2017 at 02:32:29PM +, Tim Hill wrote:
> 
> I just can't believe that a URL can contain 33% more letters than the
> King James Bible's 3,116,480.
> 
> "four gigabytes per URL"

It's probably much greater than that on modern systems, too!

B.



Re: URL lengths

2017-11-30 Thread Tim Hill
In article, Harriet Bazley wrote:
> On 28 Nov 2017 as I do recall, Daniel Silverstone  wrote:

> > On Mon, Nov 27, 2017 at 18:08:46 +, John Williams wrote:
> > > What is the maximum URL length (including POST data) that NetSurf
> > > can handle?
> >
> > I don't believe we limit URL length per se, though they get interned
> > and as such four gigabytes per URL is probably the absolute limit. 
> > In addition, POST data is nominally unlimited though I believe we
> > have a similar four gigabyte limit.

> I had an error today from Netsurf, reporting that a URL was too long to
> display (although it seemed to work).

I just can't believe that a URL can contain 33% more letters than the
King James Bible's 3,116,480.

"four gigabytes per URL"

LOL

-- 

Tim Hill

timil.com : tjrh.eu : butterwick.eu : blue-bike.uk : youngtheatre.co.uk



Re: URL lengths

2017-11-29 Thread Harriet Bazley
On 28 Nov 2017 as I do recall,
  Daniel Silverstone  wrote:

> On Mon, Nov 27, 2017 at 18:08:46 +, John Williams wrote:
> > What is the maximum URL length (including POST data) that NetSurf can
> > handle?
>
> I don't believe we limit URL length per se, though they get interned and as
> such four gigabytes per URL is probably the absolute limit.  In addition, POST
> data is nominally unlimited though I believe we have a similar four gigabyte
> limit.

I had an error today from Netsurf, reporting that a URL was too long to
display (although it seemed to work).

-- 
Harriet Bazley ==  Loyaulte me lie ==

No man has a right to live - but every man has a duty to save him if he can



Re: URL lengths

2017-11-28 Thread John Williams
In article <20171128115425.22cudniq3zrfba3l@somnambulist.local>,
   Daniel Silverstone  wrote:

> I don't believe we limit URL length per se, though they get interned and
> as such four gigabytes per URL is probably the absolute limit.  In
> addition, POST data is nominally unlimited though I believe we have a
> similar four gigabyte limit.

Right, so I will make an arbitrary, comparatively small allowance, and
then politely refuse to handle anything larger.

Thank you all for your assistance.

John

-- 
| John Williams 
| joh...@ukgateway.net

 Names for Soul Band:- Soul Doubt *



Re: URL lengths

2017-11-28 Thread Daniel Silverstone
On Mon, Nov 27, 2017 at 18:08:46 +, John Williams wrote:
> What is the maximum URL length (including POST data) that NetSurf can
> handle?

I don't believe we limit URL length per se, though they get interned and as
such four gigabytes per URL is probably the absolute limit.  In addition, POST
data is nominally unlimited though I believe we have a similar four gigabyte
limit.


So if you want to store any possible URL plus POST data you'd need eight
gigabytes per allocation.

D.

-- 
Daniel Silverstone   http://www.netsurf-browser.org/
PGP mail accepted and encouraged.Key Id: 3CCE BABE 206C 3B69



Re: URL lengths

2017-11-28 Thread Rob Kendrick
On Mon, Nov 27, 2017 at 06:08:46PM +, John Williams wrote:
> A genuine real-life number would be better!

From memory, they are dynamically allocated as needed and can be of
arbitrary length.  This is a lot more efficient than using
statically-sized buffers big enough for the worst case, especially in a
piece of software that might be storing information on tens of thousands
of URLs (everything in your history, the links to all the resources on
every page you have open, etc.).

My advice is firstly to either measure and then allocate, or read in the
data and reallocate dynamically as needed.  Secondly, use a different
language; BBC BASIC really isn't good for this sort of thing :)  (If you
were using a modern interpreted language like Lua, Python, or even Perl,
this question would never have occurred to you.)
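
If you do stay in BASIC, here is a minimal sketch (untested) of the
measure-then-allocate approach. It assumes one URL per line, and uses a
byte buffer because a BASIC string variable is capped at 255 characters;
note that memory DIMmed inside the function is never returned to the heap:

DEF FNread_url(f%, RETURN len%)
LOCAL start%, b%, i%, buf%
start% = PTR#f%
len% = 0
b% = 32
REM First pass: measure up to the first control character (CR/LF).
WHILE b% >= 32 AND NOT EOF#f%
  b% = BGET#f%
  IF b% >= 32 THEN len% += 1
ENDWHILE
DIM buf% len% : REM reserves len%+1 bytes
REM Second pass: rewind and copy the bytes into the buffer.
PTR#f% = start%
i% = 0
WHILE i% < len%
  buf%?i% = BGET#f%
  i% += 1
ENDWHILE
buf%?len% = 0 : REM NUL-terminate for passing to SYS calls
IF NOT EOF#f% THEN b% = BGET#f% : REM skips one terminator; adjust for CRLF
= buf%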

B.



Re: URL lengths

2017-11-27 Thread Gavin Wraith
In message <56a1b699dcjoh...@ukgateway.net>
  John Williams  wrote:

> In article <0612b1a156.ga...@wra1th.plus.com>,
>   Gavin Wraith  wrote:

> So you're suggesting that I measure each URL length first, perhaps BGETting
> it until I encounter a terminator, and then DIM a variable accordingly -
> or, actually, a series of concatenated variables, as I intend to BPUT them
> later; so I don't really need to have one long variable, just a series of
> suitable GET variables.

Not BGET. You do not need to work byte by byte. Why not GET$? It has
been a long time since I used BASIC.
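
The file-channel form is GET$#, which reads up to the next line end (CR,
LF or NUL), or 255 characters, whichever comes first. A minimal sketch
(untested) -- anything longer than 255 characters arrives split across
successive reads:

f% = OPENIN "URLFile"
WHILE NOT EOF#f%
  line$ = GET$#f% : REM one line, or the next 255-character chunk
  PRINT line$
ENDWHILE
CLOSE #f%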

Do you know about StrongED scripts? Here is one that replaces the text in
a StrongED window with a list of the URLs occurring in it, ignoring
duplicates:

#! lua
local pat, used = "(https?://[^%s\t]+)", { }
for line in io.lines (arg[1]) do
  for url in line:gmatch (pat) do
    if not used[url] then
      print (url)
      used[url] = true
    end -- if
  end -- for
end -- for
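
(For what it's worth, the script isn't StrongED-specific: any standalone
Lua should run it too, with the file to scan as its first argument.)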

-- 
Gavin Wraith (ga...@wra1th.plus.com)
Home page: http://www.wra1th.plus.com/



Re: URL lengths

2017-11-27 Thread Steve Fryatt
On 27 Nov, John Williams wrote in message
<56a1b699dcjoh...@ukgateway.net>:

> In article <0612b1a156.ga...@wra1th.plus.com>,
>Gavin Wraith  wrote:
>
> > I would DIM buffers for URLs from the heap as you need them.
> 
> So you're suggesting that I measure each URL length first, perhaps
> BGETting it until I encounter a terminator, and then DIM a variable
> accordingly - or, actually, a series of concatenated variables, as I
> intend to BPUT them later; so I don't really need to have one long
> variable, just a series of suitable GET variables.

Allocating the necessary memory is the correct way to do it, yes.

> A number would be easier!  How many?

As long as the URL is? My memory is that the RISC OS GUI might apply some
limits, but that the core probably doesn't.

-- 
Steve Fryatt - Leeds, England

http://www.stevefryatt.org.uk/



Re: URL lengths

2017-11-27 Thread John Williams
In article <0612b1a156.ga...@wra1th.plus.com>,
   Gavin Wraith  wrote:

> From a quick glance at the NetSurf 3.7 sources I would guess that the
> answer rather depends on which platform, and then on the particular
> machine NetSurf is running on.

I am, of course, running RISC OS.

> If your menu program is in BBC BASIC

which it is

> I would DIM buffers for URLs from the heap as you need them.

So you're suggesting that I measure each URL length first, perhaps BGETting
it until I encounter a terminator, and then DIM a variable accordingly -
or, actually, a series of concatenated variables, as I intend to BPUT them
later; so I don't really need to have one long variable, just a series of
suitable GET variables.

A number would be easier!  How many?

John

-- 
| John Williams 
| joh...@ukgateway.net

 Does 'expostulation' refer to the antics of former nuns? *



Re: URL lengths

2017-11-27 Thread Gavin Wraith
In message <56a1aa7e00joh...@ukgateway.net>
  John Williams  wrote:

> What is the maximum URL length (including POST data) that NetSurf can
> handle?

From a quick glance at the NetSurf 3.7 sources I would guess that the
answer rather depends on which platform, and then on the particular machine
NetSurf is running on. If your menu program is in BBC BASIC I would
DIM buffers for URLs from the heap as you need them. In C I would use
malloc(). In Lua, Python or Perl all that is done for you anyway, and you
do not need to think about maximum string lengths.
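
To illustrate the BASIC case (a sketch, assuming the length has already
been measured into len%):

DIM buf% len% : REM claims len%+1 bytes from BASIC's heap

One caveat: unlike C's malloc()/free(), memory DIMmed like this is never
returned, so reuse buffers where possible.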

-- 
Gavin Wraith (ga...@wra1th.plus.com)
Home page: http://www.wra1th.plus.com/



URL lengths

2017-11-27 Thread John Williams

I'm writing a little menu program to generate URL index pages.  It's to
make URLs easily available to my Linux machine if there are any problems
with a page/site under NetSurf.

What is the maximum URL length (including POST data) that NetSurf can
handle?

My program is taking text URLs, Ant URL files or Acorn URI files and
parsing them (in the latter case!) to extract the URL string for use in the
menu page, but I need to know the maximum length NetSurf can handle so that
I can provide for all possibilities that it may throw my way.
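
As a sketch of the parsing step (untested, and assuming the usual Acorn
URI file layout of a "URI" tag line, a version line, and then the URI
itself on the third line -- do check the format spec before relying on
this), in BBC BASIC:

DEF FNuri_from_file(path$)
LOCAL f%, line$, uri$
f% = OPENIN path$
line$ = GET$#f% : REM should be the "URI" tag
line$ = GET$#f% : REM format version
uri$ = GET$#f% : REM the URI itself -- note GET$# caps this at 255 chars
CLOSE #f%
= uri$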

Or I could just choose an arbitrarily large maximum - but what would a
sensible limit be before complaining? 500 characters /seems/ reasonable,
but is it actually?

A genuine real-life number would be better!

John

-- 
| John Williams 
| joh...@ukgateway.net

 Names for Soul Band:- The Soul Criterion *