Soundex is not enough. We went through Metaphone and
Double Metaphone as well. The latter performed best
when combined with simple ways to reduce
the search space (e.g., comparing only names that start with the same
letter).
But it still had too many false positives and negatives. We ended up
using
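
A minimal sketch of the approach described above: a phonetic key combined with first-letter blocking to shrink the search space. It uses classic Soundex only because it fits in a few lines; Metaphone or Double Metaphone would be passed in through the `key` argument instead. The function names and the sample names are illustrative assumptions, not taken from the original message.

```python
from collections import defaultdict
from itertools import combinations

# Classic Soundex digit groups (illustrative stand-in for Double Metaphone).
_CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        _CODES[ch] = str(digit)


def soundex(name):
    """Return the 4-character Soundex key of a name ('' if it has no letters)."""
    letters = [c for c in name.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    out = [letters[0].upper()]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not break a run of identical codes
        code = _CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated code is emitted again
    return "".join(out)[:4].ljust(4, "0")


def candidate_pairs(names, key=soundex):
    """Block names by first letter, then pair names whose phonetic keys match."""
    blocks = defaultdict(list)
    for name in names:
        if name:
            blocks[name[0].lower()].append(name)
    pairs = []
    for block in blocks.values():
        # Comparisons happen only inside a block, never across blocks.
        for a, b in combinations(block, 2):
            if key(a) == key(b):
                pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    print(candidate_pairs(["Robert", "Rupert", "Ravi", "Kishore", "Kishor"]))
```

Blocking keeps the number of comparisons manageable, but it cannot fix the false positives and negatives of the phonetic key itself.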
look at: https://world.openfoodfacts.org/
>
> -Konar
>
> On Thu, Aug 31, 2017 at 5:16 AM, Venkata Pingali <ping...@gmail.com>
> wrote:
>
>> Hi!
>>
>> Is anyone aware of any public datasets on groceries and what
>> they contain?
>>
>> thanks!
>> -Venka
Hi!
Is anyone aware of any public datasets on groceries and what
they contain?
thanks!
-Venkata
--
Datameet is a community of Data Science enthusiasts in India. Know more about
us by visiting http://datameet.org
---
You received this message because you are subscribed to the Google Groups
estment | United Way Mumbai
>
> “Everyone You Meet has something Valuable to Teach You”.
>
> -------
>
> On Wed, Feb 15, 2017 at 2:49 PM, Venkata Pingali <ping...@gmail.com>
> wrote:
>
>> Usually
Usually the state election commission's web site has
basic data (winner, loser, votes).
Here is one for Maharashtra:
https://mahasec.maharashtra.gov.in/Site/Home/Index.aspx
On Wed, Feb 15, 2017 at 2:25 PM, Kedar Annam wrote:
> Hey Pratap,
>
> Thank you very much for
The electoral list PDF usually has a hand-drawn map. But it is not worth
the effort (I was involved in such an effort in the past). The boundaries
of the booth change every six months, and in a random fashion at that.
Further, the map is often out of sync with the actual content of the
voter list.
I looked at this site some months back. Notes from my investigation then:
1. The content/JSONs are being served by a commercial GIS operated by BSNL
2. The content comes back split across many JSONs (50-60) with encoded
URLs
3. The API appeared stateful (URLs kept changing)
This, I concluded,
, 2016 at 1:30 PM, Venkata Pingali <ping...@gmail.com> wrote:
> Hi!
>
> I have been working on an open-source project to manage
> datasets called dgit. It has reached alpha stage. See the text
> below for details.
>
> dgit's goal is to enable more structured and predict
Hi!
I have been working on an open-source project to manage
datasets called dgit. It has reached alpha stage. See the text
below for details.
dgit's goal is to enable a more structured and predictable data science
process where you are able to answer questions like:
(a) Lineage/Auditability: Where
At my firm (fourthlion) we did a bit of that: discovering the boundaries
of the ward/other geography and tracing population growth, using the
voter list for the 2009-2014 time period and for one constituency.
It is a cumbersome process for multiple technical reasons including
free text fields,
Couple of thoughts based on my experience with voter lists:
1. The value of the names (for profiling population clusters) increases
with granularity. I would recommend sharing at the state level. You could
potentially annotate with variables (e.g., abstract region - north/south)
to make it a little more
, 2014 at 2:24 PM, Venkata Pingali ping...@gmail.com
wrote:
I have been working on PDF extraction. I find that PDF
combines 'what' (text itself) with 'how' (transformations,
presentation). The table that we see is often just a collection
of lines and rectangles put together in an ad hoc fashion
I have been working on PDF extraction. I find that PDF
combines 'what' (text itself) with 'how' (transformations,
presentation). The table that we see is often just a collection
of lines and rectangles put together in an ad hoc fashion.
It could be due to PDF generator libraries themselves. It
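
To illustrate the point about tables being "just lines and rectangles", here is a rough sketch of rebuilding a grid from ruling lines and then dropping words into cells. It assumes a PDF parser has already produced the coordinates of the rules and the words; the function names, the tolerance value, and the top-left coordinate origin are all assumptions made for this example, not part of any particular library.

```python
from bisect import bisect_right


def merge_close(values, tol=1.0):
    """Collapse nearly identical coordinates (e.g. both edges of a thick rule)."""
    merged = []
    for v in sorted(values):
        if not merged or v - merged[-1] > tol:
            merged.append(v)
    return merged


def build_table(v_lines, h_lines, words, tol=1.0):
    """Place words into the grid formed by vertical and horizontal rules.

    v_lines: x coordinates of vertical rules
    h_lines: y coordinates of horizontal rules (y assumed to grow downward)
    words:   iterable of (x, y, text) tuples, in reading order
    """
    xs = merge_close(v_lines, tol)
    ys = merge_close(h_lines, tol)
    n_rows, n_cols = len(ys) - 1, len(xs) - 1
    table = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for x, y, text in words:
        col = bisect_right(xs, x) - 1  # index of the rule just left of the word
        row = bisect_right(ys, y) - 1  # index of the rule just above the word
        if 0 <= row < n_rows and 0 <= col < n_cols:
            table[row][col] = f"{table[row][col]} {text}".strip()
    return table


if __name__ == "__main__":
    words = [(10, 5, "Name"), (60, 5, "Votes"), (10, 25, "A"), (60, 25, "120")]
    print(build_table([0, 0.4, 50, 100, 100.3], [0, 20, 40], words))
```

Words that fall outside the outermost rules are silently dropped here; real documents with missing or partial rules would need more forgiving handling.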
and
will be completed in two weeks or so. So we can't change the questionnaire
now, unfortunately. My intent here is to get help in figuring out how the
data we collect can be used. Thanks!
Kishore.
---
www.dakshindia.org
On Mon, Feb 10, 2014 at 12:35 PM, Venkata Pingali ping
Couple of thoughts:
1. We can ask about origin/hometown. This can help us understand
the urbanization process.
2. Will the data be public information? Who is the sponsor
of this initiative?
On Mon, Feb 10, 2014 at 11:06 AM, Kishore (Narasimhan) Mandyam
kish...@dakshindia.org wrote:
Daksh is
will be published beginning two
weeks from now.
On Monday, February 10, 2014 12:14:35 PM UTC+5:30, Venkata Pingali wrote:
Couple of thoughts:
1. We can ask about origin/hometown. This can help us understand
the urbanization process.
2. Will the data be public information? Who is the sponsor
Hi!
I am looking for archives for the last 20 years of major urban
newspapers such as Times of India.
Are you aware of any archives (free or paid)? I browsed through
online archives where available such as Hindu, ToI but need them
in a more accessible form e.g., API.
thanks!
-Venkata
Two sources:
1. Shunglu and Chaturvedi committee reports on
planning commission site.
2. State and central electricity regulatory commissions
(CERC for central level, MERC for Maharashtra etc)
-Venkata
On Thu, Nov 28, 2013 at 4:59 PM, Naveen Gattu naveen.ga...@gramener.com wrote:
HI -
I broadly agree. Technology is the easy part. I would think in terms
of architecture of coordination. Let me comment on one related
aspect.
Finally, there is the question of what incentive there is for
third-party developers to build such an API?
Surprisingly enough, it doesn't have to be
Today's press release.
http://pib.nic.in/newsite/PrintRelease.aspx?relid=88037
Minister of Water Resources and Parliamentary Affairs, Shri Pawan
Kumar Bansal released Atlas for six states viz Kerala, Tamil Nadu,
Karnataka, Chhattisgarh, Himachal Pradesh and Meghalaya in New Delhi
today.