Re: how to best organise a large community "important" books library?
On Sat, May 27, 2017 at 01:23:41PM +1000, James A. Donald wrote:
> On 2017-05-17 17:08, Zenaan Harkness wrote:
> >Has anyone done anything like this, and if so, how did you solve it?
> >
> >(Medium term, the problem begs for a git-like addressed, git-like P2P
> >addressable/verifiable "content management" solution; e.g. if I have
> >a random collection of say 10K books, it would be nice to be able to
> >say something like:
> > git library init
> >
> > # add my books:
> > git add .
> > git commit
> >
> > git library sort categories
> > git library add index1 # create an "indexing" branch
> > git commit
> >
> > # add some upstream/p2p libraries for indexing/search/federation:
> > git library index add upstream:uni.berkely
> > git library p2p add i2p.tunnel:myfriend.SHA
> > git library full-index pull myfriend.SHA
> > git library math-index pull myfriend.SHA
> > git library book:SHA pull myfriend.SHA
>
> This is the problem of clustering in groups of enormously high dimension,
> which is a well studied problem in AI.
>
> Take each substring of six words or less, that does not contain a full stop
> internally. The substring may contain a start of sentence marker at the
> beginning, and/or an end of sentence marker at the end.
>
> Each substring constitutes a vector in a space of enormously high dimension,
> the space of all possible strings of moderate length.
>
> Each such vector is randomly but deterministically mapped to a mere hundred
> or so dimensions, to a space of moderately high dimension, by randomly giving
> each coordinate the value of plus one, minus one, or zero.
>
> Two identical substrings in two separate documents will be mapped to the same
> vector in the space of moderate dimension.
>
> Each document is then given a position in the space of moderately high
> dimension by summing all the vectors, and then normalizing the resulting
> vector to length one.
>
> Thus each document is assigned a position on the hypersphere in the space of
> moderately high dimension.
>
> If two different documents contain some identical substrings, they will tend
> to be closer together.
>
> We then assign a document to its closest group of documents, and the closest
> subgroup within that group, and the closest sub sub group.

Interesting. Algorithmic auto-categorizing. Sounds like it has potential to create at least one useful index.

The main bit I don't like about Library of Congress, Dewey and other "physical library" categorization schemes is that they are evidently "optimized" for physical storage, and so they arbitrarily group categories which are not directly related. But even related categories might as well be "different categories" in the digital world - an extra folder/dir is relatively inexpensive compared with "another physical [sub-]shelf and labelling which we want to be relatively stable --in physical space-- over time, in order to minimize shuffling/reorganizing of the physical storage of books (/maps /etc)".

Categories (and sub-, sub-sub- etc) are definitely useful to folks, and ACM, some maths journal groups etc, have typically created 4 levels of categories + sub categories --just for their field--.

Again, we have no physical limits in virtual categorization space, and to top it off, with something git-like we can have as many categorizations (directory hierarchies) as people find useful - so some auto-algo thing as you suggest, LOC, Dewey, and "no useful sub-category excluded" might all be wanted by different people - so a content item exists, and is GUID addressed, and then one or more indexes/folder hierarchies overlay on top of this.

And indeed one content item such as a book may well be appropriately included in more than one category/s-category/ss-category/sss-category, --in a single chosen hierarchy--, in any particular library.
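To make the "GUID addressed item plus overlaid hierarchies" idea concrete, here is a rough Python sketch. Everything in it is an assumption for illustration only: the Library class, its method names, and the use of SHA-256 as the content ID (git itself historically used SHA-1). It is not a proposal for how a real "git library" would store things.

```python
import hashlib


class Library:
    """Toy content-addressed store with any number of overlay indexes."""

    def __init__(self):
        self.items = {}    # content ID -> raw bytes
        self.indexes = {}  # index name -> {category path (tuple) -> set of IDs}

    def add(self, content: bytes) -> str:
        """Store content under the hex digest of its bytes and return that ID."""
        cid = hashlib.sha256(content).hexdigest()
        self.items[cid] = content
        return cid

    def categorize(self, index: str, path, cid: str) -> None:
        """File an existing item under a category path in the named index."""
        if cid not in self.items:
            raise KeyError(cid)
        self.indexes.setdefault(index, {}).setdefault(tuple(path), set()).add(cid)

    def lookup(self, index: str, path):
        """Return all IDs filed at or below the given path in the named index."""
        prefix = tuple(path)
        found = set()
        for category, ids in self.indexes.get(index, {}).items():
            if category[:len(prefix)] == prefix:
                found |= ids
        return found


lib = Library()
book = lib.add(b"Some scanned maths text")

# The same item sits in two hierarchies, and twice within one of them.
lib.categorize("dewey", ["500", "510"], book)
lib.categorize("dewey", ["000", "004"], book)
lib.categorize("auto", ["cluster-3", "sub-1"], book)
```

The item is stored once; each hierarchy is only a mapping from paths to IDs, so adding another categorization scheme never moves or copies a book.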
NSA's illegal surveillance of Americans
http://www.nationalreview.com/article/447973/nsa-illegal-surveillance-americans-obama-administration-abuse-fisa-court-response [partial quote follows] The NSA intentionally and routinely intercepted communications of American citizens in violation of the Constitution. During the Obama years, the National Security Agency intentionally and routinely intercepted and reviewed communications of American citizens in violation of the Constitution and of court-ordered guidelines implemented pursuant to federal law. The unlawful surveillance appears to have been a massive abuse of the government’s foreign-intelligence-collection authority, carried out for the purpose of monitoring the communications of Americans in the United States. While aware that it was going on for an extensive period of time, the administration failed to disclose its unlawful surveillance of Americans until late October 2016, when the administration was winding down and the NSA needed to meet a court deadline in order to renew various surveillance authorities under the Foreign Intelligence Surveillance Act (FISA). The administration’s stonewalling about the scope of the violation induced an exasperated Foreign Intelligence Surveillance Court to accuse the NSA of “an institutional lack of candor” in connection with what the court described as “a very serious Fourth Amendment issue.” (The court is the federal tribunal created in 1978 by FISA; it is often referred to as a “secret court” because proceedings before it are classified and ex parte — meaning only the Justice Department appears before the court.) The FISA-court opinion is now public, available here. The unlawful surveillance was first exposed in a report at Circa by John Solomon and Sara Carter, who have also gotten access to internal, classified reports. The story was also covered extensively Wednesday evening by James Rosen and Bret Baier on Fox News’s Special Report. 
According to the internal reports reviewed by Solomon and Carter, the illegal surveillance may involve more than 5 percent of NSA searches of databases derived from what is called “upstream” collection of Internet communications. As the FISA court explains, upstream collection refers to the interception of communications “as they transit the facilities of an Internet backbone carrier.” These are the data routes between computer networks. The routes are hosted by government, academic, commercial, and similar high-capacity network centers, and they facilitate the global, international exchange of Internet traffic. Upstream collection from the Internet’s “backbone,” which accounts for about 9 percent of the NSA’s collection haul (a massive amount of communications), is distinguished from interception of communications from more familiar Internet service providers. Upstream collection is a vital tool for gathering intelligence against foreign threats to the United States. It is, of course, on foreign intelligence targets — non-U.S. persons situated outside the U.S. — that the NSA and CIA are supposed to focus. Foreign agents operating inside the U.S. are mainly the purview of the FBI, which conducts surveillance of their communications through warrants from the FISA court — individualized warrants based on probable cause that a specific person is acting as an agent of a foreign power. The NSA conducts vacuum intelligence-collection under a different section of FISA — section 702. It is inevitable that these section 702 surveillance authorities will incidentally intercept the communications of Americans inside the United States if those Americans are communicating with the foreign target. This does not raise serious Fourth Amendment concerns; after all, non-targeted Americans are intercepted all the time in traditional criminal wiretaps because they call, or are called by, the target. 
But FISA surveillance is more controversial than criminal surveillance because the government does not have to show probable cause of a crime — and when the targets are foreigners outside the U.S., the government does not have to make any showing; it may target if it has a legitimate foreign-intelligence purpose, which is really not much of a hurdle at all. So, as noted in coverage of the Obama administration’s monitoring of Trump-campaign officials, FISA section 702 provides some privacy protection for Americans: The FISA court orders “minimization” procedures, which require any incidentally intercepted American’s identity to be “masked.” That is, the NSA must sanitize the raw data by concealing the identity of the American. Only the “masked” version of the communication is provided to other U.S. intelligence agencies for purposes of generating reports and analyses. As I have previously explained, however, this system relies on the good faith of government officials in respecting privacy: There are gaping loopholes that permit American identities to be unmasked if, for example, th
Re: how to best organise a large community "important" books library?
On 2017-05-17 17:08, Zenaan Harkness wrote:
> Has anyone done anything like this, and if so, how did you solve it?
>
> (Medium term, the problem begs for a git-like addressed, git-like P2P
> addressable/verifiable "content management" solution; e.g. if I have
> a random collection of say 10K books, it would be nice to be able to
> say something like:
>  git library init
>
>  # add my books:
>  git add .
>  git commit
>
>  git library sort categories
>  git library add index1 # create an "indexing" branch
>  git commit
>
>  # add some upstream/p2p libraries for indexing/search/federation:
>  git library index add upstream:uni.berkely
>  git library p2p add i2p.tunnel:myfriend.SHA
>  git library full-index pull myfriend.SHA
>  git library math-index pull myfriend.SHA
>  git library book:SHA pull myfriend.SHA

This is the problem of clustering in groups of enormously high dimension, which is a well studied problem in AI.

Take each substring of six words or less, that does not contain a full stop internally. The substring may contain a start of sentence marker at the beginning, and/or an end of sentence marker at the end.

Each substring constitutes a vector in a space of enormously high dimension, the space of all possible strings of moderate length.

Each such vector is randomly but deterministically mapped to a mere hundred or so dimensions, to a space of moderately high dimension, by randomly giving each coordinate the value of plus one, minus one, or zero.

Two identical substrings in two separate documents will be mapped to the same vector in the space of moderate dimension.

Each document is then given a position in the space of moderately high dimension by summing all the vectors, and then normalizing the resulting vector to length one.

Thus each document is assigned a position on the hypersphere in the space of moderately high dimension.

If two different documents contain some identical substrings, they will tend to be closer together.

We then assign a document to its closest group of documents, and the closest subgroup within that group, and the closest sub sub group.
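The steps above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not a tuned implementation: the marker strings "<s>" and "</s>", the choice to count markers toward the six-word limit, lowercasing, seeding the random generator from a SHA-256 hash, and all function names are made up for the example. The group centroids passed to nearest_group are assumed to come from some earlier clustering pass.

```python
import hashlib
import math
import random

DIM = 100        # "a mere hundred or so dimensions"
MAX_WORDS = 6    # substrings of six words or less


def substrings(text):
    """Yield every run of 1..MAX_WORDS tokens that stays inside one sentence."""
    for sentence in text.split("."):
        words = sentence.lower().split()
        if not words:
            continue
        tokens = ["<s>"] + words + ["</s>"]  # assumed start/end markers
        for start in range(len(tokens)):
            last = min(start + MAX_WORDS, len(tokens))
            for end in range(start + 1, last + 1):
                yield " ".join(tokens[start:end])


def project(substring):
    """Map a substring to a fixed pseudo-random vector over {-1, 0, +1}."""
    digest = hashlib.sha256(substring.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))  # same text, same seed
    return [rng.choice((-1, 0, 1)) for _ in range(DIM)]


def embed(text):
    """Sum the projected substring vectors and normalize to length one."""
    total = [0.0] * DIM
    for s in substrings(text):
        for i, value in enumerate(project(s)):
            total[i] += value
    norm = math.sqrt(sum(x * x for x in total))
    return [x / norm for x in total] if norm else total


def similarity(a, b):
    """Dot product; for unit vectors this is the cosine of the angle."""
    return sum(x * y for x, y in zip(a, b))


def nearest_group(vec, centroids):
    """Pick the name of the group whose centroid is closest on the hypersphere."""
    return max(centroids, key=lambda name: similarity(vec, centroids[name]))


fox = embed("The quick brown fox jumps over the lazy dog.")
cat = embed("The quick brown fox jumps over the sleeping cat.")
qc = embed("Quantum computers factor large integers using period finding.")
```

Here fox and cat share most of their substrings, so they should land much closer together than fox and qc. For the sub group and sub sub group levels, the same nearest_group call would be repeated with the centroids inside the group just chosen.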
Re: PQ Crypto - 50 cracked up Qbits online within 1 year, NIST PQC Competition, etc
On Fri, 26 May 2017 00:12:49 -0400 grarpamp wrote:
> https://motherboard.vice.com/en_us/article/ibm-17-qubit-quantum-processor-computer-google
> https://www.research.ibm.com/ibm-q/
> IBM Fronts at least 17 Q-bits to the World's Private Buyers,
> 50 rough Q-Bits by Many Entities within 1 Year

so it's time to start using one time pads
Re: For Your Eyes Only...
On 05/26/2017 08:04 AM, Mirimir wrote:
> On 05/26/2017 03:30 AM, Razer wrote:
>
>> http://getprsm.com/
>
> I wonder why they didn't use getprism.com instead.
>
> Maybe because it's been squatted:
>
> https://www.hugedomains.com/domain_profile.cfm?d=getprism&e=com
>
> Only $2,295 :)

DomainTools want $ to see the domain history... Nah! But it has some history. SO I suspect it's not a moneymaker.

Btw, that site is pwned by @datacoup who WOULD like people to volunteer their metadataz

https://twitter.com/datacoup
https://datacoup.com/

After all, why bother with BTC when you can gamble ur metadataz away:

> Unlock the Value of Your Personal Data
> Introducing the world's first personal data marketplace

Rr
Re: For Your Eyes Only...
On 05/26/2017 03:30 AM, Razer wrote:
> http://getprsm.com/

I wonder why they didn't use getprism.com instead.

Maybe because it's been squatted:

https://www.hugedomains.com/domain_profile.cfm?d=getprism&e=com

Only $2,295 :)
Re: For Your Eyes Only...
On 05/25/2017 05:29 PM, Cecilia Tanaka wrote:
>
> Razer, did you notice the fact that I read the books when I had less
> than half of the age of your youngest son?

Son? You [selector]-ed the wrong person.

http://getprsm.com/

Introducing a brand new way to share everything.

No ads, ever.
Share your content without ever being interrupted again.

Unlimited Storage
With the world's largest data center, share endlessly.

320 million strong
You'll find every person you've ever known. Even grandma.

No matter where you go, there it goes.
Don't ever worry about not sharing again.

Purchases
Internet Searches
Email
Blog Posts
TV Shows Watched
Photos Uploaded
Locations
Phone Calls
Videos Watched
Texts
Social Media
And More!

Instantly upload trillions of megabytes of data.

Really fast computers
Our Titan Supercomputer is capable of handling one quadrillion requests per second.

Really big computers
Our datacenter can store up to 5 zettabytes of information.

Key Partners...

http://getprsm.com/
Re: For Your Eyes Only...
> On May 25, 2017, at 2:28 AM, Steve Kinney wrote:
>
>> On 05/24/2017 08:44 PM, Razer wrote:
>>
>> Ps. I wouldn't suppose a single one of you has ever actually read one of
>> Fleming's books.
>
> Only just all of them, even Chitty Chitty Bang Bang. :D
>
> "Once is misfortune, twice is happenstance, three times is enemy action."
>
> The first Bond flick was OK, but alas... Eon went the way of maximum
> marketing and before it was over with, selling the Playboy Lifestyle was
> the whole purpose of making Bond movies. If anyone here digs spy
> fiction at all, The Night Manager is well worth seeing: It takes a
> miniseries to tell a LeCarre story.
>
> But none of the above is a patch on L. Fletcher Prouty's magnum opiate,
> The Secret Team.
>
> :o)

I've actually never read Fleming, just sort of assumed it was way cheesy based on the movies? I have read most of LeCarre's stuff and a few other spy genre guys like Len Deighton and Frederick Forsyth... The Karla trilogy is the best, for my money.