Re: [squid-users] Caching http google deb files

2016-10-23 Thread garryd

On 2016-10-22 23:18, Heiler Bemerguy wrote:

I've never used ICAP, and I think hacking the code is way faster than
creating/using a separate service for that. And I'm not sure, but I
don't think I can manage to get this done with the current squid
options.


Hi,

For this case I also suggest using content adaptation, specifically eCAP,
for the following reasons:


* ACLs can be used to steer only the abusing replies to eCAP, which can
then mangle the Vary field
* There is no need to apply a local patch to new Squid versions
* There is no need to build Squid from sources
* There is no need to run extra daemons for content adaptation
* There is a sample adapter, 'ecap_adapter_modifying' [1], prepared by
The Measurement Factory (many thanks!), which successfully modifies an
HTTP message's body. It can be modified to mangle HTTP headers as well.


[1] http://e-cap.org/Documentation

Garri
___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


Re: [squid-users] Caching http google deb files

2016-10-22 Thread Eliezer Croitoru
Well, you are right about that, but for me it's simpler to write an ICAP
service to do that than to hack the squid code.

Eliezer


Eliezer Croitoru <http://ngtech.co.il/lmgtfy/> 
Linux System Administrator
Mobile: +972-5-28704261
Email: elie...@ngtech.co.il
 

From: Heiler Bemerguy [mailto:heiler.bemer...@cinbesa.com.br] 
Sent: Saturday, October 22, 2016 21:18
To: Eliezer Croitoru <elie...@ngtech.co.il>
Cc: squid-us...@squid-cache.org
Subject: Re: [squid-users] Caching http google deb files


Hi Eliezer
I've never used ICAP, and I think hacking the code is way faster than
creating/using a separate service for that. And I'm not sure, but I don't
think I can manage to get this done with the current squid options.
This patch will make squid NOT ignore objects in replies with "Vary: *".
It will consider them valid, cacheable objects.
And it will only treat as a valid Vary option those that begin
with "accept", or the "user-agent" one.

-- 
Best Regards,

Heiler Bemerguy
Network Manager - CINBESA
55 91 98151-4894/3184-1751
Em 21/10/2016 16:07, Eliezer Croitoru escreveu:
Instead of modifying the code, would you consider to use an ICAP service
that will mangle this?
I am unsure about the risks about doing so but why patch the sources if you
can resolve it with the current mainstream capabilities and API?

Eliezer


Eliezer Croitoru <http://ngtech.co.il/lmgtfy/>
Linux System Administrator
Mobile: +972-5-28704261
Email: elie...@ngtech.co.il <mailto:elie...@ngtech.co.il> 
 

From: squid-users [mailto:squid-users-boun...@lists.squid-cache.org] On
Behalf Of Heiler Bemerguy
Sent: Friday, October 21, 2016 18:21
To: squid-us...@squid-cache.org <mailto:squid-us...@squid-cache.org> 
Subject: Re: [squid-users] Caching http google deb files


Hello,
I've limited the "vary" usage and gained some hits by making these
modifications (in blue) to the http.cc code:
while (strListGetItem(&vary, ',', &item, &ilen, &pos)) {
    SBuf name(item, ilen);
    if (name == asterisk) {
        /*  vstr.clear();
            break; */
        continue;
    }
    name.toLower();

    if (name.cmp("accept", 6) != 0 &&
        name.cmp("user-agent", 10) != 0)
        continue;

    if (!vstr.isEmpty())
        vstr.append(", ", 2);
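Stripped of Squid internals, the effect of the patch can be sketched as a small standalone C++ function (a sketch only: std::string stands in for Squid's SBuf, and filterVary is an invented name, not Squid code):

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>

// Sketch of the patched loop's effect on a Vary header value:
// split on commas, ignore "*" instead of treating the reply as
// uncacheable, and keep only names starting with "accept" plus
// "user-agent".
std::string filterVary(const std::string &vary) {
    std::string vstr;
    std::istringstream ss(vary);
    std::string item;
    while (std::getline(ss, item, ',')) {
        const auto b = item.find_first_not_of(" \t");
        const auto e = item.find_last_not_of(" \t");
        if (b == std::string::npos)
            continue;                  // empty list member
        std::string name = item.substr(b, e - b + 1);
        std::transform(name.begin(), name.end(), name.begin(),
                       [](unsigned char c) {
                           return static_cast<char>(std::tolower(c));
                       });
        if (name == "*")
            continue;                  // patched: don't give up on "Vary: *"
        if (name.compare(0, 6, "accept") != 0 && name != "user-agent")
            continue;                  // patched: drop all other names
        if (!vstr.empty())
            vstr += ", ";
        vstr += name;
    }
    return vstr;
}
```

With this logic, "Accept-Encoding, User-Agent, Cookie" collapses to "accept-encoding, user-agent", and a bare "Vary: *" yields an empty mark instead of making the object uncacheable.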




___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


Re: [squid-users] Caching http google deb files

2016-10-22 Thread Heiler Bemerguy


Hi Eliezer

I've never used ICAP, and I think hacking the code is way faster than
creating/using a separate service for that. And I'm not sure, but I
don't think I can manage to get this done with the current squid options.


This patch will make squid NOT ignore objects in replies with "Vary: *".
It will consider them valid, cacheable objects.
And it will only treat as a valid Vary option those that begin
with "accept", or the "user-agent" one.



--
Best Regards,

Heiler Bemerguy
Network Manager - CINBESA
55 91 98151-4894/3184-1751

Em 21/10/2016 16:07, Eliezer Croitoru escreveu:

Instead of modifying the code, would you consider to use an ICAP service
that will mangle this?
I am unsure about the risks about doing so but why patch the sources if you
can resolve it with the current mainstream capabilities and API?

Eliezer


Eliezer Croitoru <http://ngtech.co.il/lmgtfy/>
Linux System Administrator
Mobile: +972-5-28704261
Email: elie...@ngtech.co.il
  


From: squid-users [mailto:squid-users-boun...@lists.squid-cache.org] On
Behalf Of Heiler Bemerguy
Sent: Friday, October 21, 2016 18:21
To: squid-us...@squid-cache.org
Subject: Re: [squid-users] Caching http google deb files


Hello,
I've limited the "vary" usage and gained some hits by making these
modifications (in blue) to the http.cc code:
while (strListGetItem(&vary, ',', &item, &ilen, &pos)) {
    SBuf name(item, ilen);
    if (name == asterisk) {
        /*  vstr.clear();
            break; */
        continue;
    }
    name.toLower();

    if (name.cmp("accept", 6) != 0 &&
        name.cmp("user-agent", 10) != 0)
        continue;

    if (!vstr.isEmpty())
        vstr.append(", ", 2);





___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


Re: [squid-users] Caching http google deb files

2016-10-21 Thread Yuri Voinov

But I think it would be quite sufficient to bring back one of the
HTTP-violation options, namely "ignore cache-control".

That's all. The rest we can do ourselves.

22.10.2016 1:28, Yuri Voinov wrote:
>
> This is inappropriate. All we really need is a
> "F*ck the RFC and f*ck anyone who opposes caching" option in squid.
>
> 22.10.2016 1:07, Eliezer Croitoru wrote:
> > Instead of modifying the code, would you consider to use an ICAP service
> > that will mangle this?
> > I am unsure about the risks about doing so but why patch the sources if you
> > can resolve it with the current mainstream capabilities and API?
> >
> > Eliezer
> >
> > Eliezer Croitoru <http://ngtech.co.il/lmgtfy/>
> > Linux System Administrator
> > Mobile: +972-5-28704261
> > Email: elie...@ngtech.co.il
> >
> > From: squid-users [mailto:squid-users-boun...@lists.squid-cache.org] On
> > Behalf Of Heiler Bemerguy
> > Sent: Friday, October 21, 2016 18:21
> > To: squid-us...@squid-cache.org
> > Subject: Re: [squid-users] Caching http google deb files
> >
> > Hello,
> > I've limited the "vary" usage and gained some hits by making these
> > modifications (in blue) to the http.cc code:
> >
> > while (strListGetItem(&vary, ',', &item, &ilen, &pos)) {
> >     SBuf name(item, ilen);
> >     if (name == asterisk) {
> >         /*  vstr.clear();
> >             break; */
> >         continue;
> >     }
> >     name.toLower();
> >
> >     if (name.cmp("accept", 6) != 0 &&
> >         name.cmp("user-agent", 10) != 0)
> >         continue;
> >
> >     if (!vstr.isEmpty())
> >         vstr.append(", ", 2);
> >
> > ___
> > squid-users mailing list
> > squid-users@lists.squid-cache.org
> > http://lists.squid-cache.org/listinfo/squid-users




___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


Re: [squid-users] Caching http google deb files

2016-10-21 Thread Yuri Voinov

This is inappropriate. All we really need is a
"F*ck the RFC and f*ck anyone who opposes caching" option in squid.


22.10.2016 1:07, Eliezer Croitoru пишет:
> Instead of modifying the code, would you consider to use an ICAP service
> that will mangle this?
> I am unsure about the risks about doing so but why patch the sources if you
> can resolve it with the current mainstream capabilities and API?
>
> Eliezer
>
> 
> Eliezer Croitoru <http://ngtech.co.il/lmgtfy/>
> Linux System Administrator
> Mobile: +972-5-28704261
> Email: elie...@ngtech.co.il
> 
>
> From: squid-users [mailto:squid-users-boun...@lists.squid-cache.org] On
> Behalf Of Heiler Bemerguy
> Sent: Friday, October 21, 2016 18:21
> To: squid-us...@squid-cache.org
> Subject: Re: [squid-users] Caching http google deb files
>
>
> Hello,
> I've limited the "vary" usage and gained some hits by making these
> modifications (in blue) to the http.cc code:
> while (strListGetItem(&vary, ',', &item, &ilen, &pos)) {
>     SBuf name(item, ilen);
>     if (name == asterisk) {
>         /*  vstr.clear();
>             break; */
>         continue;
>     }
>     name.toLower();
>
>     if (name.cmp("accept", 6) != 0 &&
>         name.cmp("user-agent", 10) != 0)
>         continue;
>
>     if (!vstr.isEmpty())
>         vstr.append(", ", 2);
>
>
>
>
>
> ___
> squid-users mailing list
> squid-users@lists.squid-cache.org
> http://lists.squid-cache.org/listinfo/squid-users




___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


Re: [squid-users] Caching http google deb files

2016-10-21 Thread Eliezer Croitoru
Instead of modifying the code, would you consider to use an ICAP service
that will mangle this?
I am unsure about the risks about doing so but why patch the sources if you
can resolve it with the current mainstream capabilities and API?

Eliezer


Eliezer Croitoru <http://ngtech.co.il/lmgtfy/> 
Linux System Administrator
Mobile: +972-5-28704261
Email: elie...@ngtech.co.il
 

From: squid-users [mailto:squid-users-boun...@lists.squid-cache.org] On
Behalf Of Heiler Bemerguy
Sent: Friday, October 21, 2016 18:21
To: squid-us...@squid-cache.org
Subject: Re: [squid-users] Caching http google deb files


Hello,
I've limited the "vary" usage and gained some hits by making these
modifications (in blue) to the http.cc code:
while (strListGetItem(&vary, ',', &item, &ilen, &pos)) {
    SBuf name(item, ilen);
    if (name == asterisk) {
        /*  vstr.clear();
            break; */
        continue;
    }
    name.toLower();

    if (name.cmp("accept", 6) != 0 &&
        name.cmp("user-agent", 10) != 0)
        continue;

    if (!vstr.isEmpty())
        vstr.append(", ", 2);



-- 
Best Regards,

Heiler Bemerguy
Network Manager - CINBESA
55 91 98151-4894/3184-1751

Em 07/10/2016 06:26, Hardik Dangar escreveu:
Hey Alex, 

I totally get that the Vary code is different. I have been trying to understand
the squid code for the last few days; although my C/C++ skills are very limited,
I am able to understand bits and pieces here and there. I have also read the
HTTP 1.1 cache specs ( https://tools.ietf.org/html/rfc2616#section-13 )

After doing a fair bit of research I believe we need to do two things,

1) start a campaign to convince webmasters to update their server configs,
for that to happen I am doing my research on apache and Nginx servers on how
to implement ( HTTP 1.1 spec cache guidelines ) and will provide them copy
paste configs for all requests or for the file types like
deb,apk,rpm,exe(binary files),etc...

I am documenting that here,
https://hardikdangar.github.io/thecacheproject/

and once I finish everything I will post everything at the squid-dev list
here so all of you can look at it and if you guys approve it, I will
personally try to contact the big providers and send them above page with
solutions. and will ask community support and will publish it to twitter and
other social sites to get support.

2) I want to build a module which will first handle Vary: * requests and
convert it into Vary: Accept-Encoding or something similar but only for the
ACL's specified by cache administrator.

Next, there are use cases like GitHub which are very difficult to handle
but I feel there is a way we can handle those use cases so I will build ACL
for those.

For this, I am trying to understand the squid code. After looking at the dev
docs, I understand how the request is handled at clientBeginRequest. But I am
very confused about how squid handles the response. I know client_side_reply.cc
is the file where the response is handled, but I am not sure how the
StoreEntry::checkCachable() method in store.cc is called before it, as that is
the method I see in the squid logs when caching is denied.

Basically, I need to know how to debug the squid source line by line. Right
now my method of testing involves building squid and adding debug lines, and
it's a very slow process as it takes time every time. Can you help me with
this? Is there a way I could send a request directly to a squid source file,
i.e. debug the source code line by line? If so, what tools are required and
how do I set them up?

Again, I am sorry if I am asking too much. My C experience is very limited,
and I feel like I am asking very naive questions, but these are very difficult
for me at this stage, and I really appreciate all of the squid dev team who
have been answering all of my questions. Thank you very much for that.


I just want better cache support for squid and modern-day use cases.

On Thu, Oct 6, 2016 at 11:25 PM, Alex Rousskov
<rouss...@measurement-factory.com> wrote:
On 10/06/2016 11:14 AM, Linda A. Walsh wrote:
> Alex Rousskov wrote:
>> We can, but ignoring Vary requires more/different work than adding
>> another refresh_pattern option. Vary is not a refresh mechanism so
>> different code areas need to be modified to ignore (but still forward!)
>> Vary.


>I can't say for certain, but I'd give it a 75% shot of it being
> used as a forced-refresh pattern

The [ab]use cases do not matter here -- the _code_ handling Vary is very
different from the code handling refresh logic. That difference is
natural and unavoidable because the two protocol mechanisms are very
different, even if they both can be and are used to create the same effect.

Alex.

___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users

Re: [squid-users] Caching http google deb files

2016-10-21 Thread Heiler Bemerguy


Hello,

I've limited the "vary" usage and gained some hits by making these 
modifications (in blue) to the http.cc code:


while (strListGetItem(&vary, ',', &item, &ilen, &pos)) {
    SBuf name(item, ilen);
    if (name == asterisk) {
        /*  vstr.clear();
            break; */
        continue;
    }
    name.toLower();

    if (name.cmp("accept", 6) != 0 &&
        name.cmp("user-agent", 10) != 0)
        continue;

    if (!vstr.isEmpty())
        vstr.append(", ", 2);



--
Best Regards,

Heiler Bemerguy
Network Manager - CINBESA
55 91 98151-4894/3184-1751



Em 07/10/2016 06:26, Hardik Dangar escreveu:

Hey Alex,

I totally get that the Vary code is different. I have been trying to
understand the squid code for the last few days; although my C/C++ skills
are very limited, I am able to understand bits and pieces here and there.
I have also read the HTTP 1.1 cache specs (
https://tools.ietf.org/html/rfc2616#section-13 )


After doing a fair bit of research I believe we need to do two things,

1) start a campaign to convince webmasters to update their server 
configs,  for that to happen I am doing my research on apache and 
Nginx servers on how to implement ( HTTP 1.1 spec cache guidelines ) 
and will provide them copy paste configs for all requests or for the 
file types like deb,apk,rpm,exe(binary files),etc...


I am documenting that here,
https://hardikdangar.github.io/thecacheproject/

and once I finish everything I will post everything at the 
squid-dev list here so all of you can look at it and if you guys 
approve it, I will personally try to contact the big providers and 
send them above page with solutions. and will ask community support 
and will publish it to twitter and other social sites to get support.


2) I want to build a module which will first handle Vary: * requests 
and convert it into Vary: Accept-Encoding or something similar but 
only for the ACL's specified by cache administrator.


Next, there are use cases like GitHub which are very difficult to 
handle but I feel there is a way we can handle those use cases so I 
will build ACL for those.


For this, I am trying to understand the squid code. After looking at the
dev docs, I understand how the request is handled at clientBeginRequest.
But I am very confused about how squid handles the response. I know
client_side_reply.cc is the file where the response is handled, but I
am not sure how the StoreEntry::checkCachable() method in store.cc is
called before it, as that is the method I see in the squid logs when
caching is denied.


Basically, I need to know how to debug the squid source line by line.
Right now my method of testing involves building squid and adding
debug lines, and it's a very slow process as it takes time every time.
Can you help me with this? Is there a way I could send a request directly
to a squid source file, i.e. debug the source code line by line? If so,
what tools are required and how do I set them up?


Again, I am sorry if I am asking too much. My C experience is very
limited, and I feel like I am asking very naive questions, but these are
very difficult for me at this stage, and I really appreciate all of the
squid dev team who have been answering all of my questions. Thank you
very much for that.



I just want better cache support for squid and modern-day use cases.

On Thu, Oct 6, 2016 at 11:25 PM, Alex Rousskov wrote:


On 10/06/2016 11:14 AM, Linda A. Walsh wrote:
> Alex Rousskov wrote:
>> We can, but ignoring Vary requires more/different work than adding
>> another refresh_pattern option. Vary is not a refresh mechanism so
>> different code areas need to be modified to ignore (but still
forward!)
>> Vary.


>I can't say for certain, but I'd give it a 75% shot of it being
> used as a forced-refresh pattern

The [ab]use cases do not matter here -- the _code_ handling Vary
is very
different from the code handling refresh logic. That difference is
natural and unavoidable because the two protocol mechanisms are very
different, even if they both can be and are used to create the
same effect.

Alex.

___
squid-users mailing list
squid-users@lists.squid-cache.org

http://lists.squid-cache.org/listinfo/squid-users





___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users




Re: [squid-users] Caching http google deb files

2016-10-07 Thread Alex Rousskov
On 10/07/2016 03:26 AM, Hardik Dangar wrote:
> 2) I want to build a module which will first handle Vary: * requests and
> convert it into Vary: Accept-Encoding or something similar but only for
> the ACL's specified by cache administrator.

If you want to convert/change the Vary response header, you can:

* write an ICAP RESPMOD service
* write an eCAP RESPMOD adapter
* add reply_header_replace code similar to request_header_replace code
  or, better, revamp related directives to make header replacing easier.

See http://wiki.squid-cache.org/SquidFaq/ContentAdaptation

As I tried to indicate earlier, a better solution would be not to
replace the Vary header (because that affects everybody receiving it
after Squid) but to make Vary interpretation configurable while still
forwarding the original Vary header. That requires more development.


If you want to minimize [C++] development, you could use c-icap or even a
temporary ICAP server script and, instead of replacing Vary, add
an X-Squid-Vary header with the right value. After that, you can modify
Squid to honor X-Squid-Vary (instead of Vary) if it is present. Look for
ACCELERATOR_VARY for similar (but different?) code. Such code may not be
officially accepted, but it can work as a proof of concept.
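The X-Squid-Vary idea above can be sketched in plain C++ (illustrative only: the Header alias and addVaryHint are invented here, not the c-icap or Squid API):

```cpp
#include <string>
#include <utility>
#include <vector>

// A header is modeled as a (name, value) pair for illustration.
using Header = std::pair<std::string, std::string>;

// Sketch of the RESPMOD adaptation: forward the original "Vary: *"
// untouched, but append an X-Squid-Vary hint header carrying the value
// a patched Squid would honor instead of Vary.
std::vector<Header> addVaryHint(std::vector<Header> headers) {
    bool wildcardVary = false;
    for (const Header &h : headers)
        if (h.first == "Vary" && h.second == "*")
            wildcardVary = true;
    if (wildcardVary)
        headers.emplace_back("X-Squid-Vary", "accept-encoding");
    return headers;
}
```

A real service would parse and re-serialize the encapsulated HTTP message around logic like this; on the Squid side, code near ACCELERATOR_VARY would then prefer X-Squid-Vary over Vary when it is present.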


> Basically, I need to know how to debug line by line source for squid.
> Right now my method of testing involves building squid and adding debug
> lines and its very slow process as it takes time every time. Can you
> help me with this ? is there a way i could send a request directly to
> squid source file i.e debug source code line by line ? If so what are
> the tools required and how to set it up ?
> 
> Again, I am sorry if i am asking too much but my C experience is very
> limited and i feel like i am asking very naive questions but these are
> very difficult for me at this stage and i really appreciate all of the
> squid dev teams who is been answering all of my questions.

Sorry, I personally cannot help you with that endeavor right now. This
may sound harsh, but seeing many folks with a lot more C++ skills fail
before you, I have to recommend staying away from non-trivial code
changes given your current skill level. It is possible to learn [C++ and
proxy] development on-the-fly, but Squid is just the wrong product for
doing that IMHO.


Good luck,

Alex.

___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


Re: [squid-users] Caching http google deb files

2016-10-07 Thread Hardik Dangar
Hey Alex,

I totally get that the Vary code is different. I have been trying to understand
the squid code for the last few days; although my C/C++ skills are very limited,
I am able to understand bits and pieces here and there. I have also read the
HTTP 1.1 cache specs ( https://tools.ietf.org/html/rfc2616#section-13 )

After doing a fair bit of research I believe we need to do two things,

1) start a campaign to convince webmasters to update their server configs,
 for that to happen I am doing my research on apache and Nginx servers on
how to implement ( HTTP 1.1 spec cache guidelines ) and will provide them
copy paste configs for all requests or for the file types like
deb,apk,rpm,exe(binary files),etc...

I am documenting that here,
https://hardikdangar.github.io/thecacheproject/

and once I finish everything I will post everything at the squid-dev list
here so all of you can look at it and if you guys approve it, I will
personally try to contact the big providers and send them above page with
solutions. and will ask community support and will publish it to twitter
and other social sites to get support.

2) I want to build a module which will first handle Vary: * requests and
convert it into Vary: Accept-Encoding or something similar but only for the
ACL's specified by cache administrator.

Next, there are use cases like GitHub which are very difficult to handle
but I feel there is a way we can handle those use cases so I will build ACL
for those.

For this, I am trying to understand the squid code. After looking at the dev
docs, I understand how the request is handled at clientBeginRequest. But I am
very confused about how squid handles the response. I know client_side_reply.cc
is the file where the response is handled, but I am not sure how the
StoreEntry::checkCachable() method in store.cc is called before it, as that is
the method I see in the squid logs when caching is denied.

Basically, I need to know how to debug the squid source line by line. Right
now my method of testing involves building squid and adding debug lines, and
it's a very slow process as it takes time every time. Can you help me with
this? Is there a way I could send a request directly to a squid source file,
i.e. debug the source code line by line? If so, what tools are required and
how do I set them up?

Again, I am sorry if I am asking too much. My C experience is very limited,
and I feel like I am asking very naive questions, but these are very difficult
for me at this stage, and I really appreciate all of the squid dev team who
have been answering all of my questions. Thank you very much for that.


I just want better cache support for squid and modern-day use cases.

On Thu, Oct 6, 2016 at 11:25 PM, Alex Rousskov <
rouss...@measurement-factory.com> wrote:

> On 10/06/2016 11:14 AM, Linda A. Walsh wrote:
> > Alex Rousskov wrote:
> >> We can, but ignoring Vary requires more/different work than adding
> >> another refresh_pattern option. Vary is not a refresh mechanism so
> >> different code areas need to be modified to ignore (but still forward!)
> >> Vary.
>
>
> >I can't say for certain, but I'd give it a 75% shot of it being
> > used as a forced-refresh pattern
>
> The [ab]use cases do not matter here -- the _code_ handling Vary is very
> different from the code handling refresh logic. That difference is
> natural and unavoidable because the two protocol mechanisms are very
> different, even if they both can be and are used to create the same effect.
>
> Alex.
>
> ___
> squid-users mailing list
> squid-users@lists.squid-cache.org
> http://lists.squid-cache.org/listinfo/squid-users
>
___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


Re: [squid-users] Caching http google deb files

2016-10-06 Thread Alex Rousskov
On 10/06/2016 11:14 AM, Linda A. Walsh wrote:
> Alex Rousskov wrote:
>> We can, but ignoring Vary requires more/different work than adding
>> another refresh_pattern option. Vary is not a refresh mechanism so
>> different code areas need to be modified to ignore (but still forward!)
>> Vary.


>I can't say for certain, but I'd give it a 75% shot of it being
> used as a forced-refresh pattern

The [ab]use cases do not matter here -- the _code_ handling Vary is very
different from the code handling refresh logic. That difference is
natural and unavoidable because the two protocol mechanisms are very
different, even if they both can be and are used to create the same effect.

Alex.

___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


Re: [squid-users] Caching http google deb files

2016-10-06 Thread Linda A. Walsh

Alex Rousskov wrote:

We can, but ignoring Vary requires more/different work than adding
another refresh_pattern option. Vary is not a refresh mechanism so
different code areas need to be modified to ignore (but still forward!)
Vary.
  


   I can't say for certain, but I'd give it a 75% shot of it being
used as a forced-refresh pattern because more browser-agents (as well
as caching solutions) are ignoring other refresh options to not have
to download unchanging content. 


   Much of google's "ssl-everywhere", IMO, is all about disabling
the ability to have common caches for multiple users so they can
track the multiple downloads and the users.


___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


Re: [squid-users] Caching http google deb files

2016-10-05 Thread Hardik Dangar
Hey Anthony,

I have used apt-cacher-ng, but it can't cache git repos or npm repos. It used
to work great until 12.02, but when we started to have a mixed setup [ ubuntu
13, 14.04 and others ] we got issues within our setup, and at one point the
issues became so frequent that we decided to scrap apt-cacher-ng.



On Thu, Oct 6, 2016 at 12:43 AM, Antony Stone <
antony.st...@squid.open.source.it> wrote:

> On Wednesday 05 October 2016 at 20:40:46, Hardik Dangar wrote:
>
> > Hey Jok,
> >
> > Thanks for the suggestion, but the big issue with that is that I have
> > to download the whole repository ( about 80-120 GB ) first, and then
> > each week I need to download 20 to 25 GB.
>
> This is not true for apt-cacher-ng.  You install it and it does nothing.
> You
> point your Debian (or Ubuntu, maybe other Debian-derived distros as well, I
> haven't tested) machines at it as their APT proxy, and it then caches
> content
> as it gets requested and downloaded.  Each machine which requests a new
> package causes that package to get cached.  Each machine which requests a
> cached package gets the local copy (unless it's been updated, in which case
> the cache gets updated).
>
> > We hardly use any of that except few popular repos.
> > big issue i always have with most of them is third party repo's.
> > squid-deb-proxy is quite reliable but again its squid with custom config
> > nothing else and it fails to cache google debs.
> >
> > Squid is perfect for me because it can cache things which is requested
> > first time. So next time anybody requests it it's ready.
>
> This is exactly how apt-cacher-ng works.  I use it myself and I would
> recommend you investigate it further for this purpose.
>
> > The problem lies when big companies like google and github does not
> wants us
> > to cache their content and puts various tricks so we can't do that.
>
> That's a strange concept for a Debian repository (even third-party).
>
> Are you sure you're talking about repositories and not just isolated .deb
> files?
>
>
> Antony.
>
> --
> A user interface is like a joke.
> If you have to explain it, it didn't work.
>
>Please reply to the list; please *don't* CC me.
> ___
> squid-users mailing list
> squid-users@lists.squid-cache.org
> http://lists.squid-cache.org/listinfo/squid-users
>
___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


Re: [squid-users] Caching http google deb files

2016-10-05 Thread Hardik Dangar
Hey Amos,

oh, i actually built archive mode squid by getting help at here,
http://bugs.squid-cache.org/show_bug.cgi?id=4604

I was thinking: what if we had a vary_mode option, just like archive mode, to
set it for a particular domain, like

acl dlsslgoogle srcdomain dl-ssl.google.com
vary_mode allow dlsslgoogle
The above could work in one of the following ways:

1) Replace the Vary header for the srcdomain with some suitable value so the
request can be cached
2) Remove the Vary header entirely for the above domain
3) Use the matching squid refresh_pattern for the srcdomain and only cache
requests for the particular type of file given in the refresh_pattern

What do you think would be easier? And how do I work on the squid source to do
the above? Any hint is appreciated.
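The decision logic behind such a hypothetical vary_mode could be sketched as follows (no such directive exists in squid today; applyVaryMode, the domain set, and the default override value are all invented for illustration):

```cpp
#include <set>
#include <string>

// For hosts matched by the (hypothetical) vary_mode ACL, replace a
// "Vary: *" value with a cache-friendly one (option 1 above); returning
// an empty string would correspond to removing the header (option 2).
// All other traffic keeps its Vary header untouched.
std::string applyVaryMode(const std::string &host,
                          const std::string &vary,
                          const std::set<std::string> &varyModeDomains,
                          const std::string &overrideValue = "Accept-Encoding") {
    if (varyModeDomains.count(host) && vary == "*")
        return overrideValue;
    return vary;
}
```

For example, with {"dl-ssl.google.com"} configured, a "Vary: *" reply from that host would be stored under "Accept-Encoding", while replies from other hosts pass through unchanged.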


One more thing: can you tell me, if we are already violating HTTP via options
like nocache, ignore-no-store, ignore-private and ignore-reload, why we can't
do the same for the Vary header?

It seems the notorious servers send a "Vary: *" header, and at times (github)
no Last-Modified header, and these are the biggest bandwidth eaters.



Thanks.
Hardik
___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


Re: [squid-users] Caching http google deb files

2016-10-05 Thread Hardik Dangar
Hey Amos,

I have implemented your patch at

and added following to my squid.conf
archive_mode allow all

and my refresh pattern is,
refresh_pattern dl-ssl.google.com/.*\.(deb|zip|tar|rpm) 129600 100% 129600
ignore-reload ignore-no-store override-expire override-lastmod ignor$

but I am still not able to cache it. Can you tell from the output below what
the problem would be? Do I need to configure anything extra?

here is the debug output for the same,


2016/10/05 15:46:25.319 kid1| 5,2| TcpAcceptor.cc(220) doAccept: New
connection on FD 14
2016/10/05 15:46:25.319 kid1| 5,2| TcpAcceptor.cc(295) acceptNext:
connection on local=[::]:3128 remote=[::] FD 14 flags=9
2016/10/05 15:46:25.319 kid1| 11,2| client_side.cc(2346) parseHttpRequest:
HTTP Client local=192.168.1.1:3128 remote=192.168.1.76:51236 FD 12 flags=1
2016/10/05 15:46:25.319 kid1| 11,2| client_side.cc(2347) parseHttpRequest:
HTTP Client REQUEST:
-
GET
http://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_i386.deb
HTTP/1.1
Host: dl-ssl.google.com
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:49.0) Gecko/20100101
Firefox/49.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Cookie:
NID=88=109tS20j8Ec0EQb5HzuNnbwtsl4sK64aakVRn-2qOe91Zv4e3st9lfyik8qQe7d12J4xBDCmdKMwiXY98a2dj4mOitaP4AbJV6fD7o9YKTxE7MziEkNCJ45GiDszPM8wXca5cuYK_gE4QVrU52VqzSa1IzmHbh_7XKsvYuDCSsgIMZaC8d4Fp01vrAU8dHPXGopVpBIxgpHwAjPv8NvLFM3e4y-um5A8umQ-GCFmpaaLd1_1jyafkNLTj-9Ix4hfsw;
SID=1ANPj1-lw03bKfunZfrmk8ZsjEcTl5AiLgwzgtzki8MZ3JuvGyYgiP7LRJ05U1HQWbf76g.;
HSID=AUu5M-p2Rw1uDb2_0; APISID=ss4uEw9eIOgmsZXv/ARs9Vws4Es_o_sfVX
Connection: keep-alive
Upgrade-Insecure-Requests: 1


--
2016/10/05 15:46:25.320 kid1| 85,2| client_side_request.cc(744)
clientAccessCheckDone: The request GET
http://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_i386.deb
is ALLOWED; last ACL checked: CONNECT
2016/10/05 15:46:25.320 kid1| 85,2| client_side_request.cc(720)
clientAccessCheck2: No adapted_http_access configuration. default: ALLOW
2016/10/05 15:46:25.320 kid1| 85,2| client_side_request.cc(744)
clientAccessCheckDone: The request GET
http://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_i386.deb
is ALLOWED; last ACL checked: CONNECT
2016/10/05 15:46:25.320 kid1| 17,2| FwdState.cc(133) FwdState: Forwarding
client request local=192.168.1.1:3128 remote=192.168.1.76:51236 FD 12
flags=1, url=
http://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_i386.deb
2016/10/05 15:46:25.320 kid1| 44,2| peer_select.cc(258) peerSelectDnsPaths:
Find IP destination for:
http://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_i386.deb'
via dl-ssl.google.com
2016/10/05 15:46:25.417 kid1| 44,2| peer_select.cc(280) peerSelectDnsPaths:
Found sources for '
http://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_i386.deb
'
2016/10/05 15:46:25.417 kid1| 44,2| peer_select.cc(281) peerSelectDnsPaths:
  always_direct = ALLOWED
2016/10/05 15:46:25.417 kid1| 44,2| peer_select.cc(282) peerSelectDnsPaths:
   never_direct = DENIED
2016/10/05 15:46:25.417 kid1| 44,2| peer_select.cc(286) peerSelectDnsPaths:
 DIRECT = local=[::] remote=[2404:6800:4008:c02::be]:80 flags=1
2016/10/05 15:46:25.417 kid1| 44,2| peer_select.cc(286) peerSelectDnsPaths:
 DIRECT = local=0.0.0.0 remote=74.125.23.136:80 flags=1
2016/10/05 15:46:25.417 kid1| 44,2| peer_select.cc(286) peerSelectDnsPaths:
 DIRECT = local=0.0.0.0 remote=74.125.23.93:80 flags=1
2016/10/05 15:46:25.417 kid1| 44,2| peer_select.cc(286) peerSelectDnsPaths:
 DIRECT = local=0.0.0.0 remote=74.125.23.91:80 flags=1
2016/10/05 15:46:25.418 kid1| 44,2| peer_select.cc(286) peerSelectDnsPaths:
 DIRECT = local=0.0.0.0 remote=74.125.23.190:80 flags=1
2016/10/05 15:46:25.418 kid1| 44,2| peer_select.cc(295) peerSelectDnsPaths:
   timedout = 0
2016/10/05 15:46:25.418 kid1| 14,2| ipcache.cc(924) ipcacheMarkBadAddr:
ipcacheMarkBadAddr: dl-ssl.google.com [2404:6800:4008:c02::be]:80
2016/10/05 15:46:25.567 kid1| 11,2| http.cc(2203) sendRequest: HTTP Server
local=192.168.1.1:36674 remote=74.125.23.136:80 FD 13 flags=1
2016/10/05 15:46:25.567 kid1| 11,2| http.cc(2204) sendRequest: HTTP Server
REQUEST:
-
GET /dl/linux/direct/mod-pagespeed-beta_current_i386.deb HTTP/1.1
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:49.0) Gecko/20100101
Firefox/49.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Cookie:
NID=88=109tS20j8Ec0EQb5HzuNnbwtsl4sK64aakVRn-2qOe91Zv4e3st9lfyik8qQe7d12J4xBDCmdKMwiXY98a2dj4mOitaP4AbJV6fD7o9YKTxE7MziEkNCJ45GiDszPM8wXca5cuYK_gE4QVrU52VqzSa1IzmHbh_7XKsvYuDCSsgIMZaC8d4Fp01vrAU8dHPXGopVpBIxgpHwAjPv8NvLFM3e4y-um5A8umQ-GCFmpaaLd1_1jyafkNLTj-9Ix4hfsw;

Re: [squid-users] Caching http google deb files

2016-10-04 Thread Hardik Dangar
Wow, I hadn't thought of that. Google might need the tracking data; that
could be the reason they have blindly put a "Vary: *" header there. Oh,
the irony: the company that lectures all of us on how to deliver content
efficiently is doing such a thing.

I have looked at your patch, but how do I enable it? Do I need to write a
custom ACL? I know I need to compile and reinstall after applying the
patch, but what exactly do I need to do in squid.conf? Looking at your
patch, I am guessing I need to write an 'archive' ACL, or I am too naive
to understand the C code :)

Also, is reply_header_replace any good for this?


On Tue, Oct 4, 2016 at 7:47 PM, Amos Jeffries  wrote:

> On 5/10/2016 2:34 a.m., Hardik Dangar wrote:
> > Hey Amos,
> >
> > We have about 50 clients which download the same Google Chrome update
> > every 2 or 3 days, which means 2.4 GB. Although the response says
> > Vary, the requested file is the same and all of it is downloaded via
> > apt update.
> >
> > Is there any option just like ignore-no-store? I know I am asking for
> > too much, but it seems very silly on Google's part that they are
> > sending a Vary header in a place where they shouldn't, as no matter
> > how you access those URLs you are only going to get those deb files.
>
>
> Some things G does only make sense when you ignore all the PR about
> wanting to make the web more efficient and consider that it is a company
> whose income is derived from recording data about people's habits and
> activities. Caching can hide that info from them.
>
> >
> > Can I hack the Squid source code to ignore the Vary header?
> >
>
> Google are explicitly saying the response changes. I suspect there is
> something involving Google account data being embedded in some of the
> downloads. For tracking, etc.
>
>
> If you are wanting to test it I have added a patch to
>  that should implement
> archival of responses where the ACLs match. It is completely untested by
> me beyond building, so YMMV.
>
> Amos
>
>


Re: [squid-users] Caching http google deb files

2016-10-04 Thread Amos Jeffries
On 5/10/2016 2:34 a.m., Hardik Dangar wrote:
> Hey Amos,
> 
> We have about 50 clients which download the same Google Chrome update
> every 2 or 3 days, which means 2.4 GB. Although the response says Vary,
> the requested file is the same and all of it is downloaded via apt
> update.
> 
> Is there any option just like ignore-no-store? I know I am asking for
> too much, but it seems very silly on Google's part that they are sending
> a Vary header in a place where they shouldn't, as no matter how you
> access those URLs you are only going to get those deb files.


Some things G does only make sense when you ignore all the PR about
wanting to make the web more efficient and consider that it is a company
whose income is derived from recording data about people's habits and
activities. Caching can hide that info from them.

> 
> Can I hack the Squid source code to ignore the Vary header?
> 

Google are explicitly saying the response changes. I suspect there is
something involving Google account data being embedded in some of the
downloads. For tracking, etc.


If you are wanting to test it I have added a patch to
 that should implement
archival of responses where the ACLs match. It is completely untested by
me beyond building, so YMMV.

Amos



Re: [squid-users] Caching http google deb files

2016-10-04 Thread Hardik Dangar
Hey Amos,

After referring to one of your old posts, I found we can use

reply_header_replace

to replace headers. Is it possible to replace the "Vary: *" header with
something appropriate?

or

do I need to look at Squid's source code to ignore the Vary header and recompile?
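For reference, a sketch of how reply_header_replace is usually written. The ACL name dlgoogle is an assumption, and note the caveat: Squid's header mangling changes the copy of the message it forwards, so this is unlikely to change how the cache itself evaluates Vary.

```
# Sketch only, with an assumed ACL name.
# reply_header_replace applies to headers denied by reply_header_access.
acl dlgoogle dstdomain dl.google.com dl-ssl.google.com
reply_header_access Vary deny dlgoogle
reply_header_replace Vary Accept-Encoding
```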



On Tue, Oct 4, 2016 at 7:04 PM, Hardik Dangar 
wrote:

> Hey Amos,
>
> We have about 50 clients which download the same Google Chrome update
> every 2 or 3 days, which means 2.4 GB. Although the response says Vary,
> the requested file is the same and all of it is downloaded via apt
> update.
>
> Is there any option just like ignore-no-store? I know I am asking for
> too much, but it seems very silly on Google's part that they are sending
> a Vary header in a place where they shouldn't, as no matter how you
> access those URLs you are only going to get those deb files.
>
> Can I hack the Squid source code to ignore the Vary header?
>
>
>
> On Tue, Oct 4, 2016 at 6:51 PM, Amos Jeffries 
> wrote:
>
>> On 5/10/2016 2:05 a.m., Hardik Dangar wrote:
>> > Hello,
>> >
>> > I am trying to cache the following deb files, as they are the most
>> > requested files on the network (Google Chrome: many clients update it
>> > almost every few days).
>> >
>> > http://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
>> > http://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_i386.deb
>> >
>> > The response headers for both contain a Last-Modified date which is
>> > 10 to 15 days old, but Squid does not seem to cache them somehow.
>> > Here is a sample response header for one of the files:
>> >
>> > HTTP Response Header
>> >
>> > Status: HTTP/1.1 200 OK
>> > Accept-Ranges: bytes
>> > Content-Length: 6662208
>> > Content-Type: application/x-debian-package
>> > Etag: "fa383"
>> > Last-Modified: Thu, 15 Sep 2016 19:24:00 GMT
>> > Server: downloads
>> > Vary: *
>>
>> The Vary header says that this response is just one of many that can
>> happen for this URL.
>>
>> The "*" in that header says that the way to determine which one the
>> client gets is based on something no proxy can ever do. Thus no cache
>> can ever re-use any content it wanted to store, making any attempt to
>> store it a pointless waste of CPU time, disk and memory space that
>> could better be used by some other, more useful object. Squid will not
>> ever cache these responses.
>>
>> (Thank you for the well written request for help anyhow.)
>>
>> Amos
>>
>>
>
>


Re: [squid-users] Caching http google deb files

2016-10-04 Thread Hardik Dangar
Hey Amos,

We have about 50 clients which download the same Google Chrome update
every 2 or 3 days, which means 2.4 GB. Although the response says Vary,
the requested file is the same and all of it is downloaded via apt
update.

Is there any option just like ignore-no-store? I know I am asking for too
much, but it seems very silly on Google's part that they are sending a
Vary header in a place where they shouldn't, as no matter how you access
those URLs you are only going to get those deb files.

Can I hack the Squid source code to ignore the Vary header?



On Tue, Oct 4, 2016 at 6:51 PM, Amos Jeffries  wrote:

> On 5/10/2016 2:05 a.m., Hardik Dangar wrote:
> > Hello,
> >
> > I am trying to cache the following deb files, as they are the most
> > requested files on the network (Google Chrome: many clients update it
> > almost every few days).
> >
> > http://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
> > http://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_i386.deb
> >
> > The response headers for both contain a Last-Modified date which is
> > 10 to 15 days old, but Squid does not seem to cache them somehow.
> > Here is a sample response header for one of the files:
> >
> > HTTP Response Header
> >
> > Status: HTTP/1.1 200 OK
> > Accept-Ranges: bytes
> > Content-Length: 6662208
> > Content-Type: application/x-debian-package
> > Etag: "fa383"
> > Last-Modified: Thu, 15 Sep 2016 19:24:00 GMT
> > Server: downloads
> > Vary: *
>
> The Vary header says that this response is just one of many that can
> happen for this URL.
>
> The "*" in that header says that the way to determine which one the
> client gets is based on something no proxy can ever do. Thus no cache
> can ever re-use any content it wanted to store, making any attempt to
> store it a pointless waste of CPU time, disk and memory space that could
> better be used by some other, more useful object. Squid will not ever
> cache these responses.
>
> (Thank you for the well written request for help anyhow.)
>
> Amos
>
>


Re: [squid-users] Caching http google deb files

2016-10-04 Thread Amos Jeffries
On 5/10/2016 2:05 a.m., Hardik Dangar wrote:
> Hello,
> 
> I am trying to cache the following deb files, as they are the most
> requested files on the network (Google Chrome: many clients update it
> almost every few days).
> 
> http://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
> http://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_i386.deb
> 
> The response headers for both contain a Last-Modified date which is
> 10 to 15 days old, but Squid does not seem to cache them somehow. Here
> is a sample response header for one of the files:
> 
> HTTP Response Header
> 
> Status: HTTP/1.1 200 OK
> Accept-Ranges: bytes
> Content-Length: 6662208
> Content-Type: application/x-debian-package
> Etag: "fa383"
> Last-Modified: Thu, 15 Sep 2016 19:24:00 GMT
> Server: downloads
> Vary: *

The Vary header says that this response is just one of many that can
happen for this URL.

The "*" in that header says that the way to determine which one the
client gets is based on something no proxy can ever do. Thus no cache can
ever re-use any content it wanted to store, making any attempt to store
it a pointless waste of CPU time, disk and memory space that could better
be used by some other, more useful object. Squid will not ever cache
these responses.
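Amos's rule can be sketched in a few lines of Python. This is illustrative only: it follows the Vary matching described in RFC 7234 section 4.1, not Squid's actual implementation.

```python
def can_reuse(vary_value, stored_req_headers, new_req_headers):
    """Decide whether a cached response may satisfy a new request,
    per the Vary matching rules of RFC 7234 section 4.1."""
    if vary_value is None:
        return True  # response did not vary on any request header
    fields = [f.strip().lower() for f in vary_value.split(",")]
    if "*" in fields:
        return False  # "Vary: *" -- no request comparison can ever match
    # Reusable only when every named header matches the stored request.
    return all(stored_req_headers.get(f) == new_req_headers.get(f)
               for f in fields)

# "Vary: Accept-Encoding" is reusable when that request header matches...
print(can_reuse("Accept-Encoding",
                {"accept-encoding": "gzip"},
                {"accept-encoding": "gzip"}))  # True
# ...but "Vary: *" is never reusable, which is why Squid refuses to store it.
print(can_reuse("*", {}, {}))  # False
```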

(Thank you for the well written request for help anyhow.)

Amos
