Re: [PHP] Detecting Binaries

2004-02-23 Thread Axel IS Main
Guys, this isn't THAT stupid of a question is it? From my perspective, the way PHP seems to see it is that I should already know what kind of file I'm looking at. In most cases that's not an unreasonable assumption. Unfortunately, that's only good for most cases. PHP is rich in ways to work

Re: [PHP] Detecting Binaries

2004-02-23 Thread Adam Voigt
Couldn't you just check the extension on the file? On Mon, 2004-02-23 at 14:03, Axel IS Main wrote: Guys, this isn't THAT stupid of a question is it? From my perspective, the way PHP seems to see it is that I should already know what kind of file I'm looking at. In most cases that's not an

Re: [PHP] Detecting Binaries

2004-02-23 Thread Axel IS Main
Yes, and in fact that is what I am doing now. This is a spider bot though, so I'm having to think of every single type of binary file that could be linked to on the web. So far I'm up to 28 with no end in sight. What about a .com file? I can't omit links that end in .com can I? That would be

Re: [PHP] Detecting Binaries

2004-02-23 Thread Jas
Well, you can check the MIME type of the file, e.g. $mimes = array(1 => 'application/octet-stream', 2 => 'image/jpeg', etc. For more info: http://us4.php.net/manual/en/ref.filesystem.php Just like with the file upload function, you can check the MIME types...
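For a spider that fetches URLs with file_get_contents(), one way to apply the MIME type idea is to read the Content-Type header the server sent back. The sketch below is only an illustration, assuming PHP 7.1 or later; the helper names content_type_from_headers() and looks_like_text_type() and the short list of accepted types are made up for this example and are not from the thread.

```php
<?php
// Hypothetical helper: return the media type from raw HTTP header lines,
// or null if no Content-Type header is present.
function content_type_from_headers(array $headers): ?string
{
    $type = null;
    foreach ($headers as $line) {
        if (stripos($line, 'Content-Type:') === 0) {
            // Drop parameters such as "; charset=utf-8".
            $value = trim(substr($line, strlen('Content-Type:')));
            $type = strtolower(trim(explode(';', $value)[0]));
        }
    }
    // After redirects, the last Content-Type seen belongs to the final response.
    return $type;
}

// Hypothetical policy: treat text/* and a few XML types as safe to parse.
// The accepted list is an assumption; adjust it to what the spider can handle.
function looks_like_text_type(?string $type): bool
{
    if ($type === null) {
        return false;
    }
    return strpos($type, 'text/') === 0
        || in_array($type, ['application/xhtml+xml', 'application/xml'], true);
}

// Usage idea (not executed here): after
//   $body = file_get_contents($url);
// the HTTP wrapper fills $http_response_header in the calling scope, so
//   looks_like_text_type(content_type_from_headers($http_response_header))
// can decide whether $body is worth processing.
```

Servers can send a wrong or missing Content-Type, so this is probably best treated as a first filter rather than a guarantee.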

Re[2]: [PHP] Detecting Binaries

2004-02-23 Thread Richard Davey
Hello Axel, Monday, February 23, 2004, 7:03:38 PM, you wrote: AIM Guys, this isn't THAT stupid of a question is it? From my perspective, AIM the way PHP seems to see it is that I should already know what kind of AIM file I'm looking at. In most cases that's not an unreasonable AIM assumption.

Re: [PHP] Detecting Binaries

2004-02-23 Thread Adam Voigt
Well, actually, to check .com, just make sure it contains a / and then the .com; that will filter yahoo.com but keep yahoo.com/downloadme.com On Mon, 2004-02-23 at 14:19, Axel IS Main wrote: Yes, and in fact that is what I am doing now. This is a spider bot though, so I'm having to think of every
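The "slash before the .com" rule amounts to looking at the extension of the URL's path rather than of the whole URL. A minimal sketch of that, assuming PHP 7 or later; url_path_extension() is a hypothetical name used only for illustration:

```php
<?php
// Hypothetical helper: return the lowercased extension of the path part of a
// URL, or '' when the URL has no path or the path has no extension.
function url_path_extension(string $url): string
{
    // parse_url() returns null when there is no path, false on malformed URLs.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        return '';
    }
    return strtolower(pathinfo($path, PATHINFO_EXTENSION));
}
```

With this, http://yahoo.com yields an empty extension because the ".com" sits in the host name, while http://yahoo.com/downloadme.com yields "com", which can then be compared against the list of binary extensions.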

Re: [PHP] Detecting Binaries

2004-02-23 Thread Adam Bregenzer
On Mon, 2004-02-23 at 14:19, Axel IS Main wrote: Yes, and in fact that is what I am doing now. This is a spider bot though, so I'm having to think of every single type of binary file that could be linked to on the web. So far I'm up to 28 with no end in sight. What about a .com file? I

Re: [PHP] Detecting Binaries

2004-02-23 Thread Marek Kilimajer
Generally, binaries have \0 in them, but not necessarily. Axel IS Main wrote: Guys, this isn't THAT stupid of a question is it? From my perspective, the way PHP seems to see it is that I should already know what kind of file I'm looking at. In most cases that's not an unreasonable
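The NUL byte heuristic is cheap to try on whatever file_get_contents() returned. A minimal sketch, assuming PHP 7 or later; contains_nul_byte() is a hypothetical name for illustration:

```php
<?php
// Hypothetical helper: true if the data contains at least one NUL (\0) byte.
function contains_nul_byte(string $data): bool
{
    // Double quotes are needed so that "\0" is a real NUL byte, not a backslash and a zero.
    return strpos($data, "\0") !== false;
}
```

As noted above, the test is not conclusive either way: some binary files contain no NUL bytes, and text encoded as UTF-16 contains many.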

Re[2]: [PHP] Detecting Binaries

2004-02-23 Thread Richard Davey
Hello Axel, Monday, February 23, 2004, 7:38:25 PM, you wrote: AIM Thanks, you just gave me the solution, I think. I don't have to strip AIM out every character above standard ascii, I just have to look for them. AIM If one is there, then just get rid of it. It's true that an OS can't AIM tell

Re: [PHP] Detecting Binaries

2004-02-23 Thread Axel IS Main
Thanks, that's very helpful. It beats the heck out of doing it the way I've been doing it. Richard Davey wrote: Hello Axel, Monday, February 23, 2004, 7:38:25 PM, you wrote: AIM Thanks, you just gave me the solution, I think. I don't have to strip AIM out every character above standard ascii,

Re: Re[2]: [PHP] Detecting Binaries

2004-02-23 Thread Evan Nemerson
On Monday 23 February 2004 11:55 am, Richard Davey wrote: Hello Axel, Monday, February 23, 2004, 7:38:25 PM, you wrote: AIM Thanks, you just gave me the solution, I think. I don't have to strip AIM out every character above standard ascii, I just have to look for them. AIM If one is there,

Re[4]: [PHP] Detecting Binaries

2004-02-23 Thread Richard Davey
Hello Evan, Monday, February 23, 2004, 8:57:43 PM, you wrote: It would be wise to check for characters from 0 to 31, if they appear then it's almost certainly (but not guaranteed) binary. EN Assuming that's decimal, you're including 0x09 0x0a and 0x0d which are, EN respectively, tab, line
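Putting the two points together (check for bytes 0 to 31, but leave out tab 0x09, line feed 0x0a and carriage return 0x0d), a rough sketch could look like the following. It assumes PHP 7 or later, and has_binary_control_chars() is a hypothetical name. Note that it also flags form feed (0x0c), which does occasionally show up in plain text.

```php
<?php
// Hypothetical helper: true if the data contains a control character in the
// range 0x00-0x1F other than tab (0x09), line feed (0x0A) or carriage return (0x0D).
function has_binary_control_chars(string $data): bool
{
    // Single quotes: the \x escapes are interpreted by PCRE, not by PHP.
    return preg_match('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', $data) === 1;
}
```

Like the other heuristics in this thread, a match means "almost certainly binary" rather than a guarantee.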

Re: [PHP] Detecting Binaries

2004-02-23 Thread Axel IS Main
That's not bad, but I found a way to do it simply using chr() and passing it a value. It turns out that if I go 0-31, almost nothing will get through. Even the simplest HTML has something in there from that list. However, by just looking between 14 and 26, one more than carriage return, and one

Re: [PHP] Detecting Binaries

2004-02-23 Thread Evan Nemerson
On Monday 23 February 2004 03:02 pm, Axel IS Main wrote: That's not bad, but I found a way to do it simply using chr() and passing it a value. It turns out that if I go 0-31, almost nothing will get through. Even the simplest HTML has something in there from that list. However, by just looking

Re: Re[4]: [PHP] Detecting Binaries

2004-02-23 Thread Lucas Gonze
Alternatively, count unigrams in the first 1000 characters and get the euclidean distance to a sample from e.g. an english text, a french text, a chinese text, etc. - Lucas -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
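A rough sketch of the unigram idea, assuming PHP 7 or later: build a byte-frequency profile of the first 1000 bytes and take the Euclidean distance to a profile built from known text. The names unigram_profile() and profile_distance() are hypothetical, and the choice of reference samples and of a cut-off distance is left open here, since the thread gives no figures for either.

```php
<?php
// Hypothetical helper: relative frequency of each byte value (0-255) in the
// first $limit bytes of $data.
function unigram_profile(string $data, int $limit = 1000): array
{
    // The cast covers older PHP versions where substr() could return false.
    $sample = (string) substr($data, 0, $limit);
    $length = strlen($sample);
    // Mode 0 returns all 256 byte values as keys with their counts.
    $counts = count_chars($sample, 0);
    $profile = [];
    foreach ($counts as $byte => $count) {
        $profile[$byte] = $length > 0 ? $count / $length : 0.0;
    }
    return $profile;
}

// Hypothetical helper: Euclidean distance between two 256-entry profiles.
function profile_distance(array $a, array $b): float
{
    $sum = 0.0;
    for ($i = 0; $i < 256; $i++) {
        $diff = $a[$i] - $b[$i];
        $sum += $diff * $diff;
    }
    return sqrt($sum);
}
```

A fetched document would then be compared against one profile per reference language, and treated as text if it lands close enough to any of them.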

Re: [PHP] Detecting Binaries

2004-02-23 Thread Shane Nelson
Richard Davey wrote: Hello Axel, Monday, February 23, 2004, 7:03:38 PM, you wrote: AIM Guys, this isn't THAT stupid of a question is it? From my perspective, AIM the way PHP seems to see it is that I should already know what kind of AIM file I'm looking at. In most cases that's not an

[PHP] Detecting Binaries

2004-02-22 Thread Axel IS Main
I'm using file_get_contents() to open URLs. Does anyone know if there is a way to look at the result and determine if the file is binary? I'd like to be able to block binaries from being processed without having to try to think of all the possible binary extensions and omit them with a