php-general Digest 16 Sep 2006 22:26:49 -0000 Issue 4351
Topics (messages 241859 through 241866):
Re: Please tell me I dont know regex
241859 by: Tom Atkinson
Re: php and session issues continued...
241860 by: Tom Atkinson
how to get page count when uploading files
241861 by: Jian Fu
241863 by: Stefan van der Linden
Re: Odd PHP memory issue
241862 by: Matthew H. North
getting base domain and sub domains from url?
241864 by: Kenneth Andresen
241865 by: Stut
241866 by: Kenneth Andresen
Administrivia:
To subscribe to the digest, e-mail:
[EMAIL PROTECTED]
To unsubscribe from the digest, e-mail:
[EMAIL PROTECTED]
To post to the list, e-mail:
php-general@lists.php.net
----------------------------------------------------------------------
--- Begin Message ---
See http://uk.php.net/manual/en/function.eregi.php#57824
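Basically you don't need to escape the dash, just place it somewhere
where it cannot be interpreted as indicating a range (first or last
inside the brackets). It will then be treated literally.
ereg() uses POSIX bracket expressions, where the backslash is an ordinary
character, so "\-_" in your second pattern is a range running from the
backslash to the underscore, and that range does not contain the dash.
The two patterns do work the same using preg_match(), where the backslash
is an escape character and a dash in the middle of a class has to be escaped.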
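A minimal sketch of the preg_match() behaviour described above. The test string is the "ab-cd" from the original question; the array keys and variable names are illustrative only.

```php
<?php
// Sketch only: compares three PCRE spellings of the same character class.
$input = 'ab-cd';

$patterns = array(
    // escaped dash at the end of the class
    'escaped_last'   => '/^[A-Za-z0-9._\-]{3,63}$/',
    // escaped dash in the middle, followed by an underscore
    'escaped_middle' => '/^[A-Za-z0-9.\-_]{3,63}$/',
    // unescaped dash placed last, where it cannot start a range
    'unescaped_last' => '/^[A-Za-z0-9._-]{3,63}$/',
);

$results = array();
foreach ($patterns as $name => $pattern) {
    // preg_match() returns 1 on a match, 0 on no match
    $results[$name] = (preg_match($pattern, $input) === 1);
    echo $name, ': ', ($results[$name] ? 'Ok' : 'Nak'), "\n";
}
```

With PCRE all three spellings accept the dash, because the backslash is honoured as an escape inside the brackets.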
william(at)elan.net wrote:
Please try below page/program on your system entering "ab-cd" and
please tell me I don't know regex - because the way I see it the
results of those tests should have been the same...
-----------------------------------------------------------------------
<html><body><form method="post" action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>">
Please enter something with a dash here: <input type='text'
name='regex_test' />
<input type='submit' value='Submit' />
</form>
<?php
if (isset($_REQUEST['regex_test'])) {
print "Testing ".$_REQUEST['regex_test'].' with regex /^[A-Za-z0-9\._\-]{3,63}$/ ... :';
if (ereg("^[A-Za-z0-9\._\-]{3,63}$",$_REQUEST['regex_test'])) print " Ok";
else print " Nak";
print "<br /><br />";
print "Testing ".$_REQUEST['regex_test'].' with regex /^[A-Za-z0-9\.\-_]{3,63}$/ ... :';
if (ereg("^[A-Za-z0-9\.\-_]{3,63}$",$_REQUEST['regex_test'])) print " Ok";
else print " Nak";
}
print "<br /><br />Current PHP Version is: ".phpversion();
?>
</body></html>
--- End Message ---
--- Begin Message ---
How are you setting the location?
If the user starts at www.yoursite.com and you redirect to yoursite.com
after the first form then you'll lose the session since it's a different
domain.
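One way to avoid switching hosts is to build the redirect target from the host the browser actually requested. This is a rough sketch under assumptions: next_step_url() and the path /form2.php are hypothetical names, not taken from the original application.

```php
<?php
// Hypothetical helper: keeps the redirect on the same host as the current
// request, so the browser sends the same session cookie back.
function next_step_url($requestHost, $path, $scheme = 'http')
{
    return $scheme . '://' . $requestHost . '/' . ltrim($path, '/');
}

// Example with a fixed host; '/form2.php' is an assumed path.
$sameHost = next_step_url('www.yoursite.com', '/form2.php');
echo $sameHost, "\n";

// In a real page the host would come from the request, for example:
//   session_write_close();
//   header('Location: ' . next_step_url($_SERVER['HTTP_HOST'], '/form2.php'));
//   exit;
```

If both www.yoursite.com and yoursite.com must share one session, the alternative is to set the session cookie domain explicitly before session_start().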
Dave Goodchild wrote:
Hi all. I have a session issue and wondered if anyone else has encountered
this:
I have an app where a user fills out 3 forms. In each case, after validation
and cleaning, the form data is passed into session variables and after the
final submission the data is entered into the database. I am using both
Firefox and IE6 on Windows XP and the system works like a dream. However, a
user testing on IE6/XP is having some problems. The values from the first
form are not being passed into the session.
When each form is successfully processed the user is redirected to the next
stage using header('Location...'). I call session_write_close before that to
ensure the session data is written out before the redirect, but the problem
persists.
Anyone recognise this issue? It is not IE-specific as it happens to me on
Firefox intermittently. I am at my wits' end! Thanks in advance for any
help!
--- End Message ---
--- Begin Message ---
I really need help and after going through the help page, I don't know where
I can post my question.
When I upload a file (word or pdf), how can I know the page counts of that
file immediately?
Thank you, Jian
--- End Message ---
--- Begin Message ---
Jian Fu wrote:
> I really need help and after going through the help page, I don't know where
> I can post my question.
> When I upload a file (word or pdf), how can I know the page counts of that
> file immediately?
> Thank you, Jian
There's no function in the PDF library to READ the file.
And about Word documents (.doc): Word calculates the number of pages at
'parsetime'. Word documents are just RTF files, and they
don't describe the number of pages in the files.
So it's really tricky to make a script that counts the pages in Word
documents, as you would have to make an RTF parser. :/
--- End Message ---
--- Begin Message ---
On 9/15/06, Richard Lynch <[EMAIL PROTECTED]> wrote:
> On Fri, September 15, 2006 10:42 am, Matthew H. North wrote:
> > We're developing a web application that involves traversal of a
> > hierarchical database structure (MySQL, PEAR::DB, and
> > PEAR::DB::DataObject). Currently that traversal is done recursively,
> > and involves visiting thousands of nodes in the tree. However, the
> > tree is relatively flat, and the recursion never gets more than 4 or 5
> > calls deep. A severely truncated but illustrative version of the code
> > of interest is:
>
> So you are just visiting the nodes, and not doing anything with them?
We're appending certain fields to put together a total result. However, as
mentioned, the amount of data collected is not anywhere near even 1MB, and in
any event, all references to the result variable are unset (AFAIK) when I
come up w/ the final 5.5MB number.
>
> It's entirely possible that PEAR::DB and/or DataObject are trying to
> cache something to "help" you...
>
> You should be able to quickly hack a *BAD* page of code with minimal
> error checking to do whatever queries PEAR::DB is doing for you.
Yeah -- I was hoping to expand my understanding of PHP internals so I could
avoid doing this, but based on more of your comments below I think I'm out of
luck.
I did note that DataObject was keeping a running cache of result sets (even
ones that I was done with for some reason), so I added a destructor on my
DataObject extending classes that cleans those up. That helps keep the
running peak mem down, but is not part of the 5.5MB that I can't get rid of.
>
> > trigger_error(memory_get_usage());
> > $result = traverse_hierarchy();
>
> At crucial points within the hierarchy, perhaps at nodes/leaves you
> expect to be at specific milestones (halfway, 25%, 75%, ...) start
> adding code that does crude things like:
>
> if ($node->name == 'This one node we think is halfway through')
> trigger_error(memory_get_usage());
>
> Log the numbers into a db with the node names and then later graph it
> to see if the memory is getting chewed up in a straight line or if it
> jumps at some point.
>
> If there's a big jump somewhere, you know where to look.
>
> If it's a straight line, then you can start doing the same thing line
> by line to find where the RAM is going.
Great idea -- I just completed a degree in computational physics, which
included courses that involved dumping loads of data and graphing them using
tools like gnuplot and OpenDx... you'd think I would have thought of this one
myself (rolling my eyes).
I _had_ thought of dumping the state at various, noted points throughout the
process, but was hoping to avoid doing this kind of lengthy analysis.
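The checkpoint idea discussed above could look roughly like this. record_memory() and the labels are made-up names for illustration, and the range() call only stands in for the real traversal work.

```php
<?php
// Hypothetical helper: appends a labelled memory reading to a log array.
function record_memory($label, array &$log)
{
    $log[] = array('label' => $label, 'bytes' => memory_get_usage());
}

$log = array();
record_memory('start', $log);

$data = range(1, 50000);   // stand-in for visiting part of the tree
record_memory('after allocation', $log);

unset($data);              // drop the only reference
record_memory('after unset', $log);

// Dump the readings; these could equally be written to a table or a
// file and graphed later.
foreach ($log as $row) {
    printf("%-18s %d\n", $row['label'], $row['bytes']);
}
```

A sudden jump between two adjacent labels points at the code in between; a steady climb suggests something accumulating per node.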
>
> Did you close down the DB connection and kill the PEAR objects?...
>
> PHP's garbage collection has had... issues... in the past.
All of my classes that extend DataObject inherit a common destructor that
calls the DataObject::free() method, which, supposedly, frees result
resources. I'm not sure what you mean by 'kill the PEAR objects', but all
references go out of scope or are unset.
>
> > The question is this: Given the following assumptions:
> >
> > 1) PHP's memory manager reclaims memory when all references to that
> > memory are gone.
>
> Well, it tries to anyway...
>
> It's not always that simple, particularly with variable variables and
> other dynamic features.
>
> > 2) A reference is 'gone' when it goes out of scope or is 'unset'.
>
> Scope seems like it should be simple, but it's not.
>
> Use unset to be certain.
>
> > 3) The only references that remain in the global context are
> > references to globals (all non-global variables have gone out of scope
> > and that memory reclaimed)
>
> See #1.
>
> PHP "scope" is not as clean-cut as C.
>
> A simple "for" loop in PHP leaves the iterator variable, last I checked.
>
> Inside a function, that should go out of scope. Outside a function,
> it stays around. foreach, I think, correctly un-scopes the vars.
This collection of statements is very illuminating. Part of my goal in
posting this question was to find out more about how PHP internals work, esp.
wrt GC. Sounds like there really isn't any hope of getting a solid set of
rules that I can follow, and I have to allow for a little slop.
Good -- that just means I can stop agonizing over this issue and 'deal with
it'.
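A small sketch of the loop-variable behaviour discussed above. The function names are illustrative. Note that foreach also leaves its loop variable set after the loop, the same as for; both variables disappear only when the enclosing function returns.

```php
<?php
// Loop variables survive the loop but stay local to the function.
function loop_scope_demo()
{
    for ($i = 0; $i < 3; $i++) {
        // empty body; only the counter matters here
    }
    foreach (array('a', 'b') as $item) {
        // empty body
    }
    // Both variables are still set at this point.
    return array(
        'i'    => isset($i) ? $i : null,
        'item' => isset($item) ? $item : null,
    );
}

// The caller has its own scope and never sees $i or $item.
function caller_sees_loop_vars()
{
    loop_scope_demo();
    return isset($i) || isset($item);
}

$after  = loop_scope_demo();
$leaked = caller_sees_loop_vars();
var_dump($after, $leaked);
```

Run at the top level of a script instead of inside a function, the same loops would leave $i and $item behind as globals until they are unset.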
>
> > 5) By doing unset($GLOBALS[$varname]) and unset($$varname), where
> > $varname is each key of the $GLOBALS array, I am effectively eliminating
> > all remaining references, and all allocated memory should be reclaimed by
> > the memory manager (except perhaps for memory associated with function and
> > class definitions).
>
> No.
>
> Dangling pointers and references not correctly cleaned up from a
> function are left out in limbo.
This I find VERY odd. So if I don't unset all references in a function before
it exits I lose that memory?
>
> > 6) Resources (think database resources) are automatically freed by
> > garbage collection when there are no more references to them
>
> Probably, eventually, if PHP's GC kicks in when you think it does.
>
> To be certain, close the DB references when you are done with them.
>
> > 7) No additional code is being evaluated within traverse_hierarchy
> > 8) I'm correct that there aren't any circular references in my code
> > nor in any PEAR module code
>
> Circular references are not a problem, really.
>
> It's the ones that get chopped off from any connection to anything you
> can get ahold of and start releasing that matter.
>
> And if you get a whole big chain of them, with no root to tie onto to
> start releasing...
Yes, and this is exactly what I'm worried about. I didn't develop the code
with this in mind, so I'm worried I may have unwittingly done exactly this.
For example, the traversal code keeps a doubly-linked list of nodes
representing the current branch of the tree. If I was losing the head of
that list I'd be causing this problem, and doing it perhaps thousands of
times.
However, as I mentioned, I've reviewed my code and DataObject and have found
no 'lost' circular references, so I don't think that's the problem (and thus
assumption #8).
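For the doubly-linked list case, one defensive option is to break the links explicitly before letting go of the head, so that plain reference counting can free each node. This is only a sketch assuming PHP 5 object semantics; BranchNode, append_node() and unlink_list() are invented names, not the actual traversal code.

```php
<?php
// Hypothetical node type for the "current branch" list.
class BranchNode
{
    public $name;
    public $prev = null;
    public $next = null;

    public function __construct($name)
    {
        $this->name = $name;
    }
}

// Appends a node after $tail (or starts a list) and returns the new tail.
function append_node($tail, $name)
{
    $node = new BranchNode($name);
    if ($tail !== null) {
        $tail->next = $node;
        $node->prev = $tail;
    }
    return $node;
}

// Walks from $head, clearing prev/next so no node refers to another.
// Returns the number of nodes visited.
function unlink_list($head)
{
    $count = 0;
    $node = $head;
    while ($node !== null) {
        $next = $node->next;
        $node->prev = null;
        $node->next = null;
        $node = $next;
        $count++;
    }
    return $count;
}

$head = append_node(null, 'root');
$mid  = append_node($head, 'child');
$tail = append_node($mid, 'leaf');

$unlinked = unlink_list($head);
echo "unlinked $unlinked nodes\n";
```

Once the links are cleared, each node is released as soon as the last variable holding it is unset or goes out of scope.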
>
> > Are there any other ways that user code can result in this apparent
> > memory leak situation? If so, what are they?
>
> PHP Extensions can have a memory leak. Ain't much PHP can do about
> that, really.
Good point.
>
> > Or, are any of my first 6 assumptions incorrect?
>
> They're a little too optimistic. :-)
Perfect -- this is exactly what I was trying to get at. I wasn't even sure
whether I should be totally confident that my assumptions about PHP internals
were that solid, and therefore, whether I should _really_ be banging my head
too hard on this.
Thanks for your detailed response -- you answered my questions and then some.
Is there a resource out there that explains PHP internals in detail?
- Matt
--
Matthew H. North
mailto:[EMAIL PROTECTED]
--- End Message ---
--- Begin Message ---
Hello all,
I am trying to extract base domains and sub domains from url's, and
expect there to exist something to do this already.
I used parse_url($url) to get the host variable.
My thought is to use $domain_elements = array_reverse(explode('.', $host));
then simply check $domain_elements[0] against some base of
international+country specific extensions. If they have sub-extensions
like "co.uk", then add a third level to the domain list, and set this up as
base domain. Any
levels beyond the registrable domain names would count as sub domains.
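The approach described above can be sketched as follows. split_domain() is a hypothetical name, and the three-entry suffix list is a placeholder, not a real or complete list of second-level extensions. The sketch indexes labels from the end instead of calling array_reverse(), which amounts to the same check.

```php
<?php
// Hypothetical helper: splits a URL's host into base domain and sub domain.
// $secondLevelSuffixes lists two-label extensions such as 'co.uk'.
function split_domain($url, array $secondLevelSuffixes)
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // no usable host in the URL
    }

    $labels = explode('.', strtolower($host));
    $n = count($labels);

    // The extension is one label ('com') unless the last two labels
    // together appear in the supplied list ('co.uk').
    $suffixLabels = 1;
    if ($n >= 2 && in_array($labels[$n - 2] . '.' . $labels[$n - 1], $secondLevelSuffixes, true)) {
        $suffixLabels = 2;
    }

    // Base domain = extension plus one more label to its left.
    $baseLabels = $suffixLabels + 1;
    if ($n < $baseLabels) {
        return null; // host is only an extension, e.g. 'co.uk'
    }

    return array(
        'base' => implode('.', array_slice($labels, -$baseLabels)),
        'sub'  => implode('.', array_slice($labels, 0, $n - $baseLabels)),
    );
}

// Placeholder list for illustration only.
$suffixes = array('co.uk', 'com.mx', 'mil.no');

$uk  = split_domain('http://www.example.co.uk/path', $suffixes);
$com = split_domain('http://a.b.example.com/', $suffixes);
print_r($uk);
print_r($com);
```

The quality of the result depends entirely on how complete the suffix list is, which is the open question in this message.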
What I would like to know is if there are some lists of all these base
domains, or maybe some function already doing what I would like to do?
In advance, thanks!
--- End Message ---
--- Begin Message ---
Kenneth Andresen wrote:
> What I would like to know is if there are some lists of all these base
> domains, or maybe some function already doing what I would like to do?
A full list of gTLDs and ccTLDs can be found here:
http://www.iana.org/domain-names.htm
-Stut
--- End Message ---
--- Begin Message ---
Thank you, Stut,
It gives the top level ones, but I can't seem to find the lower
specifications covering such as .co.uk .com.mx etc.
I am also starting to realize some countries may have two levels of top
domains - in Norway for example you may get a .no domain, but there may
also be lower level top domains such as .mil.no.
I am not sure if there exists any general list for what I originally
looked for, but then again, I am just realizing the specs I
looked at in fact do not require it to be this specific after all.
If there should be such a list somewhere I am still interested, but the
link you gave me is good enough for what I needed it for.
Again thank you!
Stut wrote:
> Kenneth Andresen wrote:
>
>> What I would like to know is if there are some lists of all these base
>> domains, or maybe some function already doing what I would like to do?
>>
>
> A full list of gTLDs and ccTLDs can be found here:
> http://www.iana.org/domain-names.htm
>
> -Stut
--- End Message ---