Re: [Sks-devel] Excessive use of /var/lib/sks/DB/log.*

2019-02-08 Thread John Zaitseff
Hi, all,

> In fact... I think that The Debian Way™ would be to have [the
> DB_CONFIG file] in /etc/sks, with a message on top clearly stating
> it should be linked from /var/lib/sks/DB (as we Debian people are
> often too lazy to look up configuration details in our software
> and expect everything to be in /etc) 

That is indeed how I set up my own system: /etc/sks/DB_CONFIG is the
actual config file, and /var/lib/sks/DB/DB_CONFIG and
/var/lib/sks/PTree/DB_CONFIG are symlinks to it.
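
A minimal sketch of that layout, for anyone who wants to script it. The helper name `link_db_config` is hypothetical, and the demonstration runs in a scratch directory rather than touching a live server; on a real Debian host the paths would be /etc/sks/DB_CONFIG, /var/lib/sks/DB and /var/lib/sks/PTree.

```python
import tempfile
from pathlib import Path
from typing import Iterable


def link_db_config(master: Path, db_dirs: Iterable[Path]) -> None:
    """Make DB_CONFIG in each database directory a symlink to one master file."""
    for db_dir in db_dirs:
        link = db_dir / "DB_CONFIG"
        # Remove an existing regular file or stale symlink before linking.
        if link.is_symlink() or link.exists():
            link.unlink()
        link.symlink_to(master)


# Scratch-directory demonstration (all paths below are stand-ins).
root = Path(tempfile.mkdtemp())
master = root / "etc" / "sks" / "DB_CONFIG"
master.parent.mkdir(parents=True)
master.write_text("set_lock_timeout 1000\n")  # placeholder content only

db_dirs = [root / "var" / "lib" / "sks" / "DB",
           root / "var" / "lib" / "sks" / "PTree"]
for d in db_dirs:
    d.mkdir(parents=True)

link_db_config(master, db_dirs)
```

It is probably safest to stop sks before replacing DB_CONFIG on a live system.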

> > If you're using a debian system, please compare
> > /usr/share/doc/sks/sampleConfig/DB_CONFIG with
> > /var/lib/sks/DB/DB_CONFIG

I overwrote my DB_CONFIG file back in September 2018.  I changed

  set_lock_timeout  1000
  set_txn_timeout   1000

to

  set_lock_timeout  1000
  set_txn_timeout   500

I did not notice any negative effects, but, by the same token, I was
still getting "add_keys_merge failed: Eventloop.SigAlarm" and "Key
addition failed: Eventloop.SigAlarm" in my log files.  Changing
/etc/sks/sksconf to include the following lines has completely
stopped those events from occurring (I made the change five days
ago):

  pagesize:          32
  ptree_pagesize:    16
  command_timeout:   600
  max_recover:       150

I fear, however, that increasing the timeouts simply pushes the
problem slightly further down the track...
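
For operators who want to check whether their own sksconf already carries the timeout settings reported in this thread, here is a rough sketch. It assumes sksconf consists of `key: value` lines and that `#` starts a comment; the function names and the sample text are made up for illustration, and the values are simply the ones posters reported using, not official recommendations.

```python
# Values reported in this thread (not an official recommendation).
REPORTED = {"command_timeout": 600, "max_recover": 150}


def parse_sksconf(text):
    """Parse 'key: value' lines, ignoring blanks and '#' comments (assumed format)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_or_different(settings, wanted):
    """Return the wanted entries that are absent or set to another value."""
    return {k: v for k, v in wanted.items() if settings.get(k) != str(v)}


# Made-up sample; a real check would read /etc/sks/sksconf instead.
sample = """\
# illustrative fragment
pagesize:          32
ptree_pagesize:    16
command_timeout:   600
"""
settings = parse_sksconf(sample)
todo = missing_or_different(settings, REPORTED)
print(todo)
```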

Yours truly,

John Zaitseff

-- 
John Zaitseff                    ,--_|\    The ZAP Group
Telephone: +61 2 9643 7737      /      \   Sydney, Australia
Email: j.zaits...@zap.org.au    \_,--._*   https://www.zap.org.au/
                                      v

___
Sks-devel mailing list
Sks-devel@nongnu.org
https://lists.nongnu.org/mailman/listinfo/sks-devel


Re: [Sks-devel] Excessive use of /var/lib/sks/DB/log.*

2019-02-08 Thread Gunnar Wolf
Hey dkg!

> If you're using a debian system, please compare
> /usr/share/doc/sks/sampleConfig/DB_CONFIG with
> /var/lib/sks/DB/DB_CONFIG -- if your files differ, i'd be happy to help
> you figure out whether the problematic behavior you're seeing could be
> attributable to those differences.
> 
> If they don't differ (if you're seeing the problematic behavior using
> the sample DB_CONFIG, and *especially* if you fix your problem by
> deviating from the shipped sample), i'd like to know that too.

With hindsight, that was my mistake - When I rebuilt my server
about a month ago (after a DB corruption), I decided to keep my
installation and just delete /var/lib/sks/DB/*, rebuilding from
dumps. Of course, I blew away the DB_CONFIG (which IMO has no business
there!)

In fact... I think that The Debian Way™ would be to have it in
/etc/sks, with a message on top clearly stating it should be linked
from /var/lib/sks/DB (as we Debian people are often too lazy to look
up configuration details in our software and expect everything to be
in /etc) 




Re: [Sks-devel] Excessive use of /var/lib/sks/DB/log.*

2019-02-08 Thread Daniel Kahn Gillmor
On Wed 2019-02-06 20:27:28 -0800, Todd Fleisher wrote:
> This sounds like you are missing the recommended DB_CONFIG values to
> prevent your server from holding onto those log files when an issue is
> encountered. As I recall, the fix is to start over from scratch and
> rebuild after first putting that file in place. It is covered in the
> list archives and I believe the source and packages ship with an
> example file that had the needed options.

If you're using a debian system, please compare
/usr/share/doc/sks/sampleConfig/DB_CONFIG with
/var/lib/sks/DB/DB_CONFIG -- if your files differ, i'd be happy to help
you figure out whether the problematic behavior you're seeing could be
attributable to those differences.

If they don't differ (if you're seeing the problematic behavior using
the sample DB_CONFIG, and *especially* if you fix your problem by
deviating from the shipped sample), i'd like to know that too.
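
One way to do that comparison while ignoring comments and spacing is sketched below. It assumes that blank lines and lines starting with `#` in DB_CONFIG are not significant; the function name `config_diff` is hypothetical, and the file contents in the demonstration are placeholders, not the real shipped sample.

```python
import difflib
import tempfile
from pathlib import Path


def config_diff(sample: Path, live: Path) -> list:
    """Unified diff of two DB_CONFIG files, ignoring comments and spacing."""
    def directives(path):
        lines = []
        for raw in path.read_text().splitlines():
            line = " ".join(raw.split())  # collapse runs of whitespace
            if line and not line.startswith("#"):
                lines.append(line)
        return lines

    return list(difflib.unified_diff(
        directives(sample), directives(live),
        fromfile=str(sample), tofile=str(live), lineterm=""))


# Demonstration with placeholder files; on Debian the real inputs would be
# /usr/share/doc/sks/sampleConfig/DB_CONFIG and /var/lib/sks/DB/DB_CONFIG.
tmp = Path(tempfile.mkdtemp())
sample = tmp / "sample_DB_CONFIG"
live = tmp / "live_DB_CONFIG"
sample.write_text("# placeholder\nset_lock_timeout 1000\nset_txn_timeout 1000\n")
live.write_text("set_lock_timeout  1000\nset_txn_timeout   500\n")

diff = config_diff(sample, live)
print("\n".join(diff))
```

An empty result means the two files carry the same directives.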

--dkg




Re: [Sks-devel] "SKS is effectively running as end-of-life software at this point"?

2019-02-08 Thread Daniel Kahn Gillmor
On Fri 2019-02-08 20:44:33 +, Andrew Gallagher wrote:
> Parse the syslogs of an old style SKS server, fetch any updated
> packets, filter them and submit to the new server.

ah yes, the SMOP (it's a "Simple Matter of Programming") argument :)

> Sync from the old network to the new one only needs to work in one
> direction.

thanks, this is a good insight.

   --dkg




Re: [Sks-devel] Unusual traffic for key 0x69D2EAD9 and 0xB33B4659

2019-02-08 Thread Todd Fleisher
To follow up on this: after making the changes below, my main disk IO went
down, but my load average went up, memory usage went through the roof, and
swapping ensued. I increased the amount of memory assigned to each of my main
nodes (those that gossip with the outside world), and it seems to be holding
steady so far: https://imgur.com/a/b0S4Ui2

I believe the nodes may have also been having issues gossiping as I saw 
outbound network traffic flatline during the same time periods: 
https://imgur.com/a/IEoLboM 

-T

> On Feb 6, 2019, at 4:21 PM, Todd Fleisher wrote:
> 
> I also applied these configuration options earlier today to all the servers 
> in 1 of my pools that was experiencing high IO load and repeated SigAlarms:
> command_timeout: 600
> wserver_timeout: 30
> max_recover: 150
> 
> And since then, everything has been quiet:
> 
> IO on the main node that gossips externally: https://i.imgur.com/ERgz0Xo.jpg 
> 
> 
> IO from another node in the same pool that gossips internally with the above 
> node: https://i.imgur.com/wsaxrJ5.jpg 
> 
> Hopefully this can help other operators keep things in better shape for the 
> time being.
> 
> -T
> 
> 
>> On Feb 6, 2019, at 3:22 AM, Rolf Wuerdemann wrote:
>> 
>> With your suggestions:
>> 
>> load average below 1
>> Traffic: ~150G/day
>> 
>> Best,
>> 
>>   Rolf
>> 
>> On 2019-02-04 12:52, Martin Dobrev wrote:
>>> Hi,
>>> I've spent the last week trying to optimize my configuration as much as
>>> possible. Following advice from a previous mail I've added:
>>>
>>>   command_timeout: 600
>>>   wserver_timeout: 30
>>>   max_recover: 150
>>>
>>> to my sksconf, and it seems this fixed the majority of the EventLoop
>>> failures. I've also added DB_CONFIG in the KDB/PTree folders to get rid
>>> of the DB archive logs that were causing plenty of IO load too.
>>> My clusters are now happily responding to queries and load-average is
>>> below one. Traffic-wise things look better too, ~20GB/day.
>>> Kind regards,
>>> Martin Dobrev
>>> P.S. Adding/changing DB_CONFIG might cause an error in the databases
>>> that you can easily fix by running
>>> db_recover -e -v -h /{KDB,PTree}
>>> On 04/02/2019 09:49, Rolf Wuerdemann wrote:
>>>> Hi,
>>>> Don't get me wrong, but within three days I've got 450G of traffic,
>>>> 99.9% of which can be attributed to sks. Extrapolated to 30 days this
>>>> means 4.5T (which is in good agreement with your 2+T/key for these
>>>> two poison keys).
>>>> With this amount of traffic and the possibility of getting
>>>> more of these keys (thus more traffic) at any moment, I think it's
>>>> only a question of time until the network with the current
>>>> implementation will vanish. Traffic increased by roughly a factor of
>>>> 300 (15G->4.5T) within twelve months; nodes within the network
>>>> decreased by a factor of at least two over the same time.
>>>> So: where to go and how?
>>>> Just my 2ct,
>>>> rowue
>>>> On 2019-01-30 22:09, Martin Dobrev wrote:
>>>>> Hi,
>>>>> My observations so far show that both keys generate 2+ TB/month of
>>>>> traffic on average across all my clustered nodes. I'm running nginx +
>>>>> Varnish in-memory cache tuned to a 5-minute TTL, which leaves plenty of
>>>>> CPU cycles for the never-ending EventLoop alarm loops. The latter
>>>>> cause load-average spikes of up to 10 with just 4 Docker containers
>>>>> running on a 12-core system.
>>>>> Don't get me wrong. The throttling penalty is something I'd swallow
>>>>> as long as we keep the network running.
>>>>> Regards,
>>>>> Martin
>>>>> keyserver.dobrev.eu | pgp.dobrev.it
>>>>>
>>>>> -------- Original message --------
>>>>> From: Kristian Fiskerstrand
>>>>> Date: 30/01/2019 20:18 (GMT+00:00)
>>>>> To: Shengjing Zhu <zsj950...@gmail.com>, sks-devel@nongnu.org
>>>>> Subject: Re: [Sks-devel] Unusual traffic for key 0x69D2EAD9 and
>>>>> 0xB33B4659
>>>>> On 1/12/19 8:15 PM, Shengjing Zhu wrote:
>>>>>> I think these requests are quite unusual.
>>>>>> Does anyone know what happens to these two keys?
>>>>> Just to add a comment on this, adding a cache on the load-balancer is
>>>>> really a nice way to slow down hits on the underlying SKS nodes. I keep
>>>>> a cache for 10 minutes in nginx, which really makes life more pleasant.
>>>>> --
>>>>> Kristian Fiskerstrand
>>>>> Blog: https://blog.sumptuouscapital.com
>>>>> Twitter: @krifisk
>>>>> Public OpenPGP keyblock at hkp://pool.sks-keyservers.net
>>>>> fpr:94CB AFDD 3034 5109 5618 35AA 0B7F 8B60 E3ED FAE3
>>>>> "Action is the foundational key to all success"
>>>>> (Pablo Picasso)
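
The extrapolation in Rolf Wuerdemann's quoted message of 2019-02-04 can be checked with a few lines of arithmetic. The figures are the ones given in that message; treating 1T as 1000G and a month as 30 days are assumptions made here.

```python
observed_gb = 450            # traffic attributed to sks over the sample period
observed_days = 3
baseline_gb_per_month = 15   # monthly figure quoted for twelve months earlier

per_day_gb = observed_gb / observed_days              # daily rate
per_month_tb = per_day_gb * 30 / 1000                 # assumes 1T = 1000G
growth = per_day_gb * 30 / baseline_gb_per_month      # month-over-month factor

print(per_day_gb, per_month_tb, growth)
```

The daily rate also matches the "~150G/day" reported in the later message quoted above.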

Re: [Sks-devel] "SKS is effectively running as end-of-life software at this point"?

2019-02-08 Thread Andrew Gallagher


> On 8 Feb 2019, at 19:02, Daniel Kahn Gillmor wrote:
> 
> Figuring out how to do the partial-sync for a limited time sounds
> difficult to me, and i wonder whether it might be better/faster/cheaper
> to just deploy such an update-only network, and not bother with the
> partial sync.

Parse the syslogs of an old style SKS server, fetch any updated packets, filter 
them and submit to the new server. Sync from the old network to the new one 
only needs to work in one direction. 

A



Re: [Sks-devel] "SKS is effectively running as end-of-life software at this point"?

2019-02-08 Thread Daniel Kahn Gillmor
On Thu 2019-02-07 23:15:18 +0100, Kristian Fiskerstrand wrote:
> The current discussions we're having (e.g. during the OpenPGP email summit
> in Brussels in October and lately at FOSDEM last weekend) are about
> eventually not storing UIDs at all on the keyservers, but requiring the
> user to do key discovery through WKD, directly on a website or the like.
> This still allows using keyservers to distribute revocation certificates
> and updates to subkeys etc, but not as a discovery mechanism.

thanks for this summary, kristian.  it roughly matches my recollection
of these discussions as well.

It's conceivable that such a constrained updates-only-keyserver network
could also host self-signed user IDs, with reasonable constraints
(e.g. valid UTF-8 only, no more than 512 octets), but no third-party
certifications.  This would permit keyserver-based updates of
expirations, since primary key expiration timestamps are currently
stored in the self-signatures over the User IDs.
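
As a rough illustration of the two User ID constraints mentioned above (valid UTF-8, at most 512 octets), a filter could look like the sketch below. The function name is hypothetical, and the sketch deliberately does not attempt the cryptographic check of the self-signature, which a real node would also need.

```python
MAX_UID_OCTETS = 512  # limit suggested in this message


def acceptable_user_id(raw: bytes, max_octets: int = MAX_UID_OCTETS) -> bool:
    """Return True if the raw User ID is valid UTF-8 and within the size limit."""
    if len(raw) > max_octets:
        return False
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Example inputs (made up for illustration).
ok = acceptable_user_id("Alice Example <alice@example.org>".encode("utf-8"))
bad_encoding = acceptable_user_id(b"\xff\xfe not utf-8")
too_long = acceptable_user_id(b"a" * 513)
```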

Alternately, we could encourage OpenPGP implementations to issue primary
key expiration timestamps as direct-key signatures, not involving a user
ID at all.

Critically, though, this updates-only keyserver network would *only*
permit retrieval of keys by fingerprint, and would not provide a
web-based tool for browsing based on User ID, to avoid the usability
failures that we've seen attributed to SKS in the past.

Each node in this proposed updates-only network would also need to be
able to cryptographically verify the OpenPGP packets that it
synchronizes, and should reject packets that cannot be cryptographically
verified.

> Pool-wise it'd be setting up a separate keyserver network that will
> gossip with the existing network for a time, with separate pools for the
> with-uid and without-uid servers, before the full switch is done...

Figuring out how to do the partial-sync for a limited time sounds
difficult to me, and i wonder whether it might be better/faster/cheaper
to just deploy such an update-only network, and not bother with the
partial sync.

> The new network would be running on software replacing SKS, using a
> better-suited database backend and a multi-threaded implementation. The
> current disagreement is really with regard to whether these should be
> "validating keyservers" or not, and how such servers could interact with
> non-validating ones.

"validating keyservers" form an entirely distinct interaction model,
since they are focused on User IDs.  They should not be conflated with
this updates-only proposal.  There's no need to coordinate their
development.

--dkg
