On 13/10/16 18:50, Ellison Anne Williams wrote:
> The embedded lookup tables are computed Query side as they are specific to
> the (encrypted) query vectors. The 'embedded' part is key here - if you
> compute them in the Responder, then you have to repeat that computation
> each time you would like to run the query instead of just pulling the
> (one-time) pre-computed lookup table from the Query object.
Yet as we can see, the responder may choose:
- not to use a lookup table at all,
- to generate it for this query and keep it in memory ("embedded"),
- to precompute the table and store it in HDFS,
- some other way of getting the exp values that we have not thought of yet.
So it is odd that the details of how the responder finds the exp values
are decided by the querier.
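
To make the options above concrete, here is a rough sketch of what a
responder-side abstraction could look like. The names (ExpSource, DirectExp,
CachedExp) are hypothetical and do not exist in the code base today; the
sketch also assumes a single modulus per query and non-negative integer
exponents. It only covers the first two options (no table, and an in-memory
table built by the responder).

```java
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: how the responder obtains base^exponent mod N.
interface ExpSource {
    BigInteger exp(BigInteger base, int exponent);
}

// Option 1: no lookup table at all, compute every value on demand.
class DirectExp implements ExpSource {
    private final BigInteger modulus;

    DirectExp(BigInteger modulus) {
        this.modulus = modulus;
    }

    @Override
    public BigInteger exp(BigInteger base, int exponent) {
        return base.modPow(BigInteger.valueOf(exponent), modulus);
    }
}

// Option 2: a table the responder fills lazily and keeps in memory.
class CachedExp implements ExpSource {
    private final Map<BigInteger, Map<Integer, BigInteger>> table = new HashMap<>();
    private final ExpSource fallback;

    CachedExp(BigInteger modulus) {
        this.fallback = new DirectExp(modulus);
    }

    @Override
    public BigInteger exp(BigInteger base, int exponent) {
        // Look up the row for this base, then the cell for this exponent,
        // computing and storing the value only on the first request.
        return table.computeIfAbsent(base, b -> new HashMap<>())
                    .computeIfAbsent(exponent, e -> fallback.exp(base, e));
    }

    // Number of distinct bases currently held in the table.
    int cachedBases() {
        return table.size();
    }
}
```

With something along these lines the responder picks the implementation, and
the Query object would not need to carry the table or know which one is used.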
p.s. it makes me a bit nervous, though without any concrete proof, for
the querier to be giving the responder pre-computed exponent values used
in compiling the encrypted response. What if the querier chooses to lie
through the expTable (e.g. always answer 1)? Can't they use that to
figure out details of the underlying data?
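
As an illustration of the concern, and not a claim about what the current
code does, a responder that accepts a table from the querier could at least
check entries against its own modPow before trusting them. The class and
method names below are made up for this sketch, and the table layout
(base -> exponent -> value) is an assumption.

```java
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: verifies a querier-supplied exp table.
class ExpTableCheck {
    // Returns true only if every entry equals base^exponent mod modulus.
    // Assumes all exponents in the table are non-negative.
    static boolean isConsistent(Map<BigInteger, Map<Integer, BigInteger>> table,
                                BigInteger modulus) {
        for (Map.Entry<BigInteger, Map<Integer, BigInteger>> row : table.entrySet()) {
            BigInteger base = row.getKey();
            for (Map.Entry<Integer, BigInteger> cell : row.getValue().entrySet()) {
                BigInteger expected =
                    base.modPow(BigInteger.valueOf(cell.getKey()), modulus);
                if (!expected.equals(cell.getValue())) {
                    return false; // the supplied value does not match
                }
            }
        }
        return true;
    }
}
```

Checking every entry costs as much as building the table, so in practice
this would have to be a random spot-check, which is weaker.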
> Note that in Spark, there is an option to compute the lookup table in a
> distributed form (i.e. not embedded in the Query).
> Thus, computation of lookup tables can happen on the Responder side (there
> is such an implementation for Spark), but the embedded lookup table is a
> different animal.
Right, I am just thinking about the embedded lookup table right now.
> Make sense?
I agree that having the responder run the same query again is likely to
be a common case, and there should be some way to cache the working data
in case the query is seen again.
I'm not yet convinced that embedding it in the query is the right answer.
> On Wed, Oct 12, 2016 at 9:36 AM, Tim Ellison <t.p.elli...@gmail.com> wrote:
>> On 29/09/16 11:29, Ellison Anne Williams wrote:
>>> In general, I am in favor of an abstract class.
>>> However, note that in the distributed case, the 'table' is generated in a
>>> distributed fashion and then used as such too ('split' and distributed).
>>> FWIW - In preliminary testing, the lookup tables ended up not performing
>>> any better at scale than the local caching mechanism that is currently in
>>> place and used by default (in
>> I'm trying to figure out why the Query is responsible for maintaining
>> the expTable / expFile* info? These tables are only used by the
>> responders, so doesn't it make sense to move the logic over there?
>> The responders should decide whether they want to use caches to
>> calculate the response, not the person asking the query.