>> Working on the API and web UI next, then the p2p part of it. Feel free
>> to submit any feature requests or have a play :-)
>
Hi Matthew,

Thank you very much for your reply and time spent thinking about all of the below. Much appreciated!

> P2P sounds ripe for abuse by bad actors... A few scenarios:

That's correct. I think the authz/authn issues have already been solved in other places. I'm thinking about things like signing up on StackOverflow or Reddit and what you can do the first time without any reputation etc. Similar to email.

I was chatting to Justin Richer (https://www.se-radio.net/2019/08/episode-376-justin-richer-on-api-security-with-oauth-2/) about this last month:

"I took a look at the peer project and it sounds interesting. A lot like BitTorrent’s protocol, but with the sharing at a higher level, it seems? So it might be worthwhile researching into how graph networks like that determine trustworthiness of nodes. Most of them have a kind of distributed consensus state that gets reached after some time, and so there’s no client authentication needed within the network itself because the clients will be identified by some ephemeral key and trusted based on actions instead of a pre-registration.

Still, there are a few different efforts that are dealing with bridging registration type questions in the OAuth and related spaces. OAuth 2 assumes clients all have client IDs and they’re pre-registered. The Dynamic Registration spec (RFC7591) allows that registration to happen programmatically as a discrete pre-step, but it also allows the client to present a signed assertion (the software statement) that helps the client claim that it is legitimate.

An extension to OpenID Connect recently introduced the idea of the client sending a “registration” object with the initial request to the AS, to provide a drive-by registration in a single step. The client would get a client ID out the other end if it’s successful. I haven’t seen this applied in practice anywhere yet.

The OpenID SIOP group has been discussing overloading the Client ID parameter itself to contain semantic information allowing the client to send an identifier that the AS could use to fetch client registration information. This subverts the idea of the client ID as understood by most implementations (it’s now client-supplied and meaningful instead of AS-supplied and opaque to the client). The frontrunner here is using DIDs and DID documents to convey stuff, but that’s mostly because that’s the tech this crowd currently likes a lot.

In GNAP we’ve inverted the registration requirement a bit: the protocol’s set up to assume that you’re coming in with no previous registration, so you can send any client information necessary during the initial request, and that initial request always happens the same way regardless of how the interactions and other next steps go. But there’s an optimization for cases when you :do: have a pre-registered client, so that you can send the ID instead of the client info itself.

I’m not sure how much of that actually applies to what you’re working on, based on my very limited understanding of what you’re doing, but I hope it’s helpful. Good luck with the project!"

> 1. You only get the list if you provide a list of your own. Therefore,
> someone adds some random IPs into a list, then knows what the state of the
> network is, and as soon as the IP they're using appears on the list, they
> stop using it until it drops back off.

True. The IP address harvesting is one thing, but stage two, when they actively try to make phone calls, will always happen as it's too lucrative not to. That's the data I'm also interested in getting and sharing. Folks that run the nodes will be able to add their own phone number allocations and I'm thinking about using the various RIR feeds, RPKI etc. Again, I think this is a solved problem; I just need to find the right place to look.

> 2. IPv6 means presumably blocking /64s at a time rather than individual
> addresses, I don't know if privacy addressing etc is a thing in the telephony
> market, where addresses rotate after a while?

Not sure yet.

> 3. CGNAT means you might affect more than you intended, and the problem will
> only get worse over time.

How is this currently handled with an infected PC behind CGNAT? That's a solved problem?

> 4. If the source IP is just a compromised device, you've booted that person
> (who may be an entire office) off SIP for a week or more, even if they fix
> the issue.

You don't need to block them, but depending on what the ITSP wants to do, they could get limited service etc.

> Additionally, from a feature POV:
>
> 1. BGP sounds like a needless over-complication. Surely just some iptables
> (realistically: nftables) hooks would do?

Both. Depends on how you run your nodes. The BGP part I just like the thought of and want to explore.

> 2. A user is never going to pay for all data collected if it's available via
> P2P, and if it isn't all on P2P, then why would anyone use the P2P version?
> Not to mention it's once again a GDPR minefield.

I think IP addresses and GDPR is a solved problem?

> 3. "Small binary size for IoT usage" -- presumably this is going either on
> your voice gateway or being scraped from logs, it's way out of scope for IoT?

Maybe some form of it lives in a device/gateway. I was looking at Juniper JET and MQTT-type things for the data sharing part:
https://www.juniper.net/documentation/us/en/software/junos/jet-developer/index.html

I'm just thinking about devices like the RIPE Atlas probes and small devices that can just sit doing this - https://atlas.ripe.net/probes/

> Might I suggest just implementing a DNSBL or similar? Would be a lot simpler,
> allows for local caching, and it's very easy to extend -- allowing AXFR/IXFR
> if you wanted users to be able to scrape the entire list, or just with a
> pointer to an HTTP(S) URL that the zone can be downloaded. You can even parse
> submitted data and maybe even do a probe of your own or correlate with other
> submitted reports so that you only implement when multiple submissions from
> different locations report the same thing. Sure, it's not the distributed
> content hosting model you're looking for, but otherwise there's no stopping
> it from being abused.
>

This is a great idea and worth exploring. It might be useful for the bootstrapping/authz/authn part of the P2P. I'm really, really trying to not have to run anything, but everyone I speak to says the problem just gets pushed further up:

"SentryPeer looks cool. I guess its p2p discovery mechanism primarily must work over WAN while Zyre emphasizes LAN (via UDP broadcast) and it works well there. Zyre can instead connect via TCP to some "well known" zgossip servers. This works on the WAN if the zgossip servers are accessible by the peer. Of course, it also adds some amount of unwanted centralization."

More here: https://github.com/zeromq/zyre/issues/701#issuecomment-947808963

> I used to run a couple of nodes in the PGP keyserver ring (aka "SKS") and
> it's amazing what things people will do to either be a nuisance or to show
> how "smart" they are. I would *strongly* recommend speaking with more
> operators first.

I'm doing the p2p/sharing last after the API and web UI. I'm going to enjoy this problem!

> Just my two pence worth. Maybe I'm wrong -- I've not done any SIP for a
> decade, and certainly not at your scale.

I'm very grateful. Thank you.
