Re: saving web page body to file... help needed

2007-01-17 Thread Siddhesh PaiRaikar


Thanks for the reply. I understand the complexity involved in working with
Squid, as it is a huge code base. I'm sorry to probe too much into this as well.
We are currently working on Squid 2.6.STABLE4.


> But Squid doesn't. It sends data to the client while it arrives from the
>server.


OK, from this I just need to know one thing: doesn't Squid temporarily buffer the body at all before it sends it to the client? If we have to study the source code, we need to know at least whether it temporarily buffers the body content before sending it to the client, or whether there is some other path. If it does buffer, and we could learn about the data structure, it would save us a lot of time and let us work on the content-filtering part instead of getting involved with the intricacies of Squid. Thanks.




> and focus on implementing your filter
> not having to worry about the details of what goes on within Squid. This
> will be your quickest path.



Our aim at the moment is also just to focus on the filter, before embedding the code anywhere; hence we do not want to get inside the code of Squid, and hence this post. It would really be of great help if we knew where the body is buffered, along with the data structure if possible. If it is not buffered at all, we will have to think of alternate approaches and won't go into the code.



thanks.  


-Siddhesh Pai Raikar


Re: I want to subscribe

2007-01-17 Thread Adrian Chadd
On Thu, Jan 11, 2007, Marin Stavrev wrote:
> Hello,
> 
> my name is Marin Stavrev and I've developed a handy patch for squid about
> marking (via TOS or an IP option) traffic originating from local squid
> engine's cache and/or neighbouring peers. The idea is to assist a QOS
> solution that is able to detect how the proxied content has been served -
> from cache (expensive) or directly from peering server (cheap). The name of
> the patch is ZPH (Zero Penalty Hit). And so far I've hosted it on a server
> that is no longer going to be available (www.it-academy.bg/zph). I've moved
> the content to another hosting site (zph.bratcheda.org) and I wanted
> to post the announcement in
> the developer's list, as many people here have had interest about it.
> 

I've taken a look at the work. It's pretty straightforward.
The latest Linux kernel patch link, however, doesn't work (the 2.6.5.15 one).
Could you please fix that up?
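(For anyone on the list following along: the marking amounts to setting the IP TOS byte on the client-facing socket so an external QOS engine can match on it. A minimal sketch, assuming a Linux host; the TOS value here is illustrative, not the value ZPH actually uses:)

```python
import socket

def mark_cache_hit(sock: socket.socket, tos: int = 0x10) -> int:
    """Set the IP TOS byte on a socket, as a proxy could do for
    responses served from its local cache. Returns the value the
    kernel now reports, so the caller can verify the mark took."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)
    return sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS)
```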

Thanks,



Adrian



Re: saving web page body to file... help needed

2007-01-17 Thread Henrik Nordstrom
On Wed, 2007-01-17 at 21:59 +0530, Siddhesh PaiRaikar wrote:

> So we currently need to know the source file in which Squid takes in
> the body content of a web page from the web, the function involved, and
> the name of the data structure it temporarily stores the body in before
> storing it in the cache. We can then store it in a file there and use
> it as required.

But Squid doesn't. It sends data to the client while it arrives from the
server.
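In other words, the body only ever exists as transient chunks in I/O buffers. A toy sketch of the streaming idea (not Squid's actual code; Squid-2's real path goes through its store and comm layers):

```python
def relay(read_chunk, write_chunk, bufsize=4096):
    """Copy data from a server-side reader to a client-side writer in
    fixed-size chunks; at no point is the whole body held in memory."""
    total = 0
    while True:
        chunk = read_chunk(bufsize)
        if not chunk:  # empty read means end of body
            break
        write_chunk(chunk)
        total += len(chunk)
    return total
```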

> studying the source code is taking a very long time and we are running
> a time constraint.. so if we could please get some help on the
> source file, the function name and the data structure name which
> stores the body it would be great.

If you want to do this in Squid you will need to study quite a bit of
source code, I am afraid, especially if you are looking at the Squid-2.x
code base. Squid-3 is a bit easier with its client-streams interface,
but documentation is pretty thin on how to use it.

So grab a Squid-3 snapshot and C-ICAP, and focus on implementing your
filter without having to worry about the details of what goes on within
Squid. This will be your quickest path.
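The filtering logic you plug into the ICAP server can then stay very small. A rough sketch (the word list and function name are invented for the example; a real c-icap module is a C plugin written against its module API):

```python
BANNED_TERMS = {"badword", "forbidden"}  # hypothetical block list

def allow_body(body: bytes) -> bool:
    """Return True if the response body contains none of the banned
    terms, False if the page should be blocked."""
    text = body.decode("utf-8", errors="replace").lower()
    return not any(term in text for term in BANNED_TERMS)
```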

Regards
Henrik


signature.asc
Description: This is a digitally signed message part


Re: saving web page body to file... help needed

2007-01-17 Thread Siddhesh PaiRaikar

Thank you for your quick reply. We will try to use the ICAP server
as you suggested.

But currently we are concentrating only on the filtering part, before
we get to embedding the code.

Once that is completed we will extend it to any server, and right now
we have Squid installed on our work machines.

So we currently need to know the source file in which Squid takes in
the body content of a web page from the web, the function involved, and
the name of the data structure it temporarily stores the body in before
storing it in the cache. We can then store it in a file there and use
it as required.

Studying the source code is taking a very long time and we are running
under a time constraint, so if we could please get some help on the
source file, the function name, and the data structure name which
stores the body, it would be great.

thanks a lot

Siddhesh Pai Raikar


On 1/17/07, Alex Rousskov <[EMAIL PROTECTED]> wrote:

On Wed, 2007-01-17 at 19:07 +0530, Siddhesh PaiRaikar wrote:

> we are trying to develop a small enhancement to the existing application of
> SquidGuard using the Squid proxy server, which can later be embedded into
> Squid itself as an HTML web page body scanner for unwanted content.

Please consider using an ICAP (RFC 3507, i-cap.org) server for content
scanning, blocking, and manipulation instead of building this feature
directly into Squid. To implement your functionality, you can modify one
of the free or for-a-fee ICAP servers available on the web.

Besides having to work with a much simpler and smaller code base, you
will have the advantage of being compatible with other popular proxies,
because they all speak ICAP.

Good luck,

Alex.





Re: saving web page body to file... help needed

2007-01-17 Thread Alex Rousskov
On Wed, 2007-01-17 at 19:07 +0530, Siddhesh PaiRaikar wrote:

> we are trying to develop a small enhancement to the existing application of
> SquidGuard using the Squid proxy server, which can later be embedded into
> Squid itself as an HTML web page body scanner for unwanted content.

Please consider using an ICAP (RFC 3507, i-cap.org) server for content
scanning, blocking, and manipulation instead of building this feature
directly into Squid. To implement your functionality, you can modify one
of the free or for-a-fee ICAP servers available on the web.

Besides having to work with a much simpler and smaller code base, you
will have the advantage of being compatible with other popular proxies,
because they all speak ICAP.
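As a flavour of the protocol: ICAP wraps the HTTP message in an Encapsulated header of byte offsets (RFC 3507, section 4.4.1). A small sketch of computing it for a RESPMOD request carrying request headers, response headers, and a body:

```python
def encapsulated_header(req_hdr: bytes, res_hdr: bytes) -> str:
    """Build the Encapsulated header value: each entry gives the byte
    offset at which that section starts within the encapsulated data."""
    return (f"req-hdr=0, res-hdr={len(req_hdr)}, "
            f"res-body={len(req_hdr) + len(res_hdr)}")
```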

Good luck,

Alex.




saving web page body to file... help needed

2007-01-17 Thread Siddhesh PaiRaikar

Hi,


we are trying to develop a small enhancement to the existing application of
SquidGuard using the Squid proxy server, which can later be embedded into
Squid itself as an HTML web page body scanner for unwanted content.

The idea is to extract the body of the HTML page and store it in a file,
and then do all the work there, i.e. use the desired search techniques and
allow/disallow the page based on the content.

For that we need to save the body content of every HTML page received in a
temporarily created file and then work on the file as desired.
As we are under a bit of a time constraint here, we are not able to scan
the entire code and see where exactly Squid takes in the web page from the
internet and temporarily stores it before displaying it.
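What we have in mind is roughly the following (the file handling and word list are just placeholders for illustration):

```python
import tempfile

BANNED_WORDS = {"unwanted", "blocked"}  # hypothetical list

def scan_page(html: str) -> bool:
    """Write the page body to a temporary file, read it back, and scan
    it for banned words; True means allow, False means disallow."""
    with tempfile.NamedTemporaryFile("w+", suffix=".html") as f:
        f.write(html)
        f.seek(0)
        content = f.read().lower()
    return not any(word in content for word in BANNED_WORDS)
```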

Pretty obviously the server must be doing that; we tried for a long time to
find out from the source files, but to no avail.
We are novices in this field of development and hence would like some assistance.

If we could just know where exactly Squid takes the web page body and
stores it before sending it to the browser for display, i.e. the exact
source file name, the function name, and the variable in the source code
of Squid in which the content is stored, it would be great.

Eagerly awaiting a response.

thank you.

-Siddhesh Pai Raikar
(India)

P.S. I really did not know who this was to be sent to, and I tried finding
out without much success, so please forgive me if you are not the one meant to
receive this.